Robotic system for object size measurement

ABSTRACT

Systems and methods for object detection and robotic control or pickup of the detected objects are provided. Systems and methods described herein provide for the determination and adjustment of minimum viable regions on a surface of an object for robotic pickup. Determination of a minimum viable region may increase the accuracy and efficiency of robotic object handling. The minimum viable region may be used to select grasping areas for a movement operation configured to provide supplemental image information for estimating dimensions of a target object.

CROSS-REFERENCE TO RELATED APPLICATION(S)

The present application claims the benefit of U.S. Provisional Appl. No. 63/189,743, entitled “A ROBOTIC SYSTEM FOR OBJECT SIZE MEASUREMENT OR MINIMUM VIABLE REGION DETECTION” and filed May 18, 2021, the entire content of which is incorporated by reference herein.

FIELD OF THE INVENTION

The present technology is directed generally to robotic systems and, more specifically, to systems, processes, and techniques for performing object size measurement or estimation and/or minimum viable region detection.

BACKGROUND

With their ever-increasing performance and lowering cost, many robots (e.g., machines configured to automatically/autonomously execute physical actions) are now extensively used in various different fields. Robots, for example, can be used to execute various tasks (e.g., manipulate or transfer an object through space) in manufacturing and/or assembly, packing and/or packaging, transport and/or shipping, etc. In executing the tasks, the robots can replicate human actions, thereby replacing or reducing the human involvement that is otherwise required to perform dangerous or repetitive tasks.

BRIEF SUMMARY

According to an embodiment hereof, a computing system is provided. The computing system comprises at least one processing circuit in communication with a robot, having an arm and an end-effector connected thereto, and a camera having a field of view, the computing system being configured, when one or more objects are or have been in the field of view, to execute instructions stored on a non-transitory computer-readable medium. The instructions are for obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command is configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

A further embodiment provides a method of controlling a robotic system comprising a non-transitory computer-readable medium and at least one processing circuit in communication with a camera having a field of view and configured to execute instructions. The method comprises obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command is configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

In a further embodiment, a non-transitory computer-readable medium is provided. The non-transitory computer-readable medium includes instructions for execution, when one or more objects are or have been in the field of view, by at least one processing circuit in communication with a camera having a field of view, the instructions being configured for: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command is configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-1D illustrate systems for performing or facilitating defining a minimum viable region, consistent with embodiments hereof.

FIGS. 2A-2D provide block diagrams that illustrate a computing system configured to perform or facilitate defining a minimum viable region, consistent with embodiments hereof.

FIGS. 2E-2F provide examples of image information processed by systems consistent with embodiments hereof.

FIGS. 3A-3H illustrate aspects of defining a minimum viable region, according to embodiments hereof.

FIG. 4 provides a flow diagram that illustrates a method of defining a minimum viable region, according to an embodiment hereof.

FIG. 5 provides a flow diagram that illustrates a method of estimating dimensions of a target object, according to an embodiment hereof.

FIGS. 6A and 6B illustrate aspects of estimating dimensions of a target object, according to embodiments hereof.

DETAILED DESCRIPTION

Systems and methods for a robotic system with a coordinated transfer mechanism are described herein. The robotic system (e.g., an integrated system of devices that each execute one or more designated tasks) configured in accordance with some embodiments autonomously executes integrated tasks by coordinating operations of multiple units (e.g., robots).

The technology described herein provides technical improvements to the existing computer-based image recognition and robotic control fields. The technical improvements increase the overall speed and reliability of identifying gripping portions of objects, thereby increasing the efficiency and reliability of robotic interactions with the objects. Using image information to determine and differentiate objects present within a field of view of a camera, the process described herein further improves existing image recognition through the use of movement of the objects to adjust and assist in the identification of potential gripping portions of one target object.

In particular, the present technology described herein assists a robotic system to interact with a particular object among a plurality of objects when the dimensions and positions of each object are unknown or known with incomplete accuracy. For example, if a plurality of objects are positioned flush with one another, existing computer-based image recognition may have difficulty identifying each object and reliably and accurately instructing a robotic system on how to interact with the objects. In particular, if the dimensions of the objects are not accurately identified, it may not be clear to a robotic system where one object ends and another begins. Thus, the system risks attempting to grasp an object at a location where it intersects with other objects. In such a case, the system may fail to grasp either object. Although the exact dimensions of an object may not be known with complete accuracy, the systems and methods provided herein provide the ability to quickly and reliably identify at least a portion, e.g., a minimum viable region, of an object that may be grasped by the robot arm without the need to identify or determine the exact edges of the object. Further, the system may be configured to adjust a location at which an object is grabbed. If an object is grabbed at certain locations (e.g., off-center locations), transporting the object may be difficult. Systems and methods provided herein may use movement of the object after initial grasping by the robot arm to determine the exact dimensions of the object and adjust or alter how the robot interacts with the object based on the updated dimensions.

In the following, specific details are set forth to provide an understanding of the presently disclosed technology. In embodiments, the techniques introduced here may be practiced without including each specific detail disclosed herein. In other instances, well-known features, such as specific functions or routines, are not described in detail to avoid unnecessarily obscuring the present disclosure. References in this description to “an embodiment,” “one embodiment,” or the like mean that a particular feature, structure, material, or characteristic being described is included in at least one embodiment of the present disclosure. Thus, the appearances of such phrases in this specification do not necessarily all refer to the same embodiment. On the other hand, such references are not necessarily mutually exclusive either. Furthermore, the particular features, structures, materials, or characteristics can be combined in any suitable manner in one or more embodiments. It is to be understood that the various embodiments shown in the figures are merely illustrative representations and are not necessarily drawn to scale.

Several details describing structures or processes that are well-known and often associated with robotic systems and subsystems, but that can unnecessarily obscure some significant aspects of the disclosed techniques, are not set forth in the following description for purposes of clarity. Moreover, although the following disclosure sets forth several embodiments of different aspects of the present technology, several other embodiments may have different configurations or different components than those described in this section. Accordingly, the disclosed techniques may have other embodiments with additional elements or without several of the elements described below.

Many embodiments or aspects of the present disclosure described below may take the form of computer- or controller-executable instructions, including routines executed by a programmable computer or controller. Those skilled in the relevant art will appreciate that the disclosed techniques can be practiced on or with computer or controller systems other than those shown and described below. The techniques described herein can be embodied in a special-purpose computer or data processor that is specifically programmed, configured, or constructed to execute one or more of the computer-executable instructions described below. Accordingly, the terms “computer” and “controller” as generally used herein refer to any data processor and can include Internet appliances and handheld devices (including palm-top computers, wearable computers, cellular or mobile phones, multi-processor systems, processor-based or programmable consumer electronics, network computers, minicomputers, and the like). Information handled by these computers and controllers can be presented at any suitable display medium, including a liquid crystal display (LCD). Instructions for executing computer- or controller-executable tasks can be stored in or on any suitable computer-readable medium, including hardware, firmware, or a combination of hardware and firmware. Instructions can be contained in any suitable memory device, including, for example, a flash drive, USB device, and/or other suitable medium.

The terms “coupled” and “connected,” along with their derivatives, can be used herein to describe structural relationships between components. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” can be used to indicate that two or more elements are in direct contact with each other. Unless otherwise made apparent in the context, the term “coupled” can be used to indicate that two or more elements are in either direct or indirect (with other intervening elements between them) contact with each other, or that the two or more elements co-operate or interact with each other (e.g., as in a cause-and-effect relationship, such as for signal transmission/reception or for function calls), or both.

Any reference herein to image analysis by a computing system may be performed according to or using spatial structure information that may include depth information which describes respective depth values of various locations relative to a chosen point. The depth information may be used to identify objects or estimate how objects are spatially arranged. In some instances, the spatial structure information may include or may be used to generate a point cloud that describes locations of one or more surfaces of an object. Spatial structure information is merely one form of possible image analysis, and other forms known by one skilled in the art may be used in accordance with the methods described herein.

FIG. 1A illustrates a system 1500 for performing object detection, or, more specifically, object recognition. More particularly, the system 1500 may include a computing system 1100 and a camera 1200. In this example, the camera 1200 may be configured to generate image information which describes or otherwise represents an environment in which the camera 1200 is located, or, more specifically, represents an environment in the camera's 1200 field of view (also referred to as a camera field of view). The environment may be, e.g., a warehouse, a manufacturing plant, a retail space, or other premises. In such instances, the image information may represent objects located at such premises, such as boxes, bins, cases, crates, or other containers. The system 1500 may be configured to generate, receive, and/or process the image information, such as by using the image information to distinguish between individual objects in the camera field of view, to perform object recognition or object registration based on the image information, and/or to perform robot interaction planning based on the image information, as discussed below in more detail (the terms “and/or” and “or” are used interchangeably in this disclosure). The robot interaction planning may be used to, e.g., control a robot at the premises to facilitate robot interaction between the robot and the containers or other objects. The computing system 1100 and the camera 1200 may be located at the same premises or may be located remotely from each other. For instance, the computing system 1100 may be part of a cloud computing platform hosted in a data center which is remote from the warehouse or retail space and may be communicating with the camera 1200 via a network connection.

In an embodiment, the camera 1200 (which may also be referred to as an image sensing device) may be a 2D camera and/or a 3D camera. For example, FIG. 1B illustrates a system 1500A (which may be an embodiment of the system 1500) that includes the computing system 1100 as well as a camera 1200A and a camera 1200B, both of which may be an embodiment of the camera 1200. In this example, the camera 1200A may be a 2D camera that is configured to generate 2D image information which includes or forms a 2D image that describes a visual appearance of the environment in the camera's field of view. The camera 1200B may be a 3D camera (also referred to as a spatial structure sensing camera or spatial structure sensing device) that is configured to generate 3D image information which includes or forms spatial structure information regarding an environment in the camera's field of view. The spatial structure information may include depth information (e.g., a depth map) which describes respective depth values of various locations relative to the camera 1200B, such as locations on surfaces of various objects in the camera 1200's field of view. These locations in the camera's field of view or on an object's surface may also be referred to as physical locations. The depth information in this example may be used to estimate how the objects are spatially arranged in three-dimensional (3D) space. In some instances, the spatial structure information may include or may be used to generate a point cloud that describes locations on one or more surfaces of an object in the camera 1200B's field of view. More specifically, the spatial structure information may describe various locations on a structure of the object (also referred to as an object structure).

In an embodiment, the system 1500 may be a robot operation system for facilitating robot interaction between a robot and various objects in the environment of the camera 1200. For example, FIG. 1C illustrates a robot operation system 1500B, which may be an embodiment of the system 1500/1500A of FIGS. 1A and 1B. The robot operation system 1500B may include the computing system 1100, the camera 1200, and a robot 1300. As stated above, the robot 1300 may be used to interact with one or more objects in the environment of the camera 1200, such as with boxes, crates, bins, or other containers. For example, the robot 1300 may be configured to pick up the containers from one location and move them to another location. In some cases, the robot 1300 may be used to perform a de-palletization operation in which a group of containers or other objects are unloaded and moved to, e.g., a conveyor belt. In some implementations, the camera 1200 may be attached to the robot 1300, such as to a robot arm 3320 of the robot 1300. In some implementations, the camera 1200 may be separate from the robot 1300. For instance, the camera 1200 may be mounted to a ceiling of a warehouse or other structure and may remain stationary relative to the structure.

In an embodiment, the computing system 1100 of FIGS. 1A-1C may form or be integrated into the robot 1300, which may also be referred to as a robot controller. A robot control system may be included in the system 1500B and is configured to, e.g., generate commands for the robot 1300, such as a robot interaction movement command for controlling robot interaction between the robot 1300 and a container or other object. In such an embodiment, the computing system 1100 may be configured to generate such commands based on, e.g., image information generated by the camera 1200. For instance, the computing system 1100 may be configured to determine a motion plan based on the image information, wherein the motion plan may be intended for, e.g., gripping or otherwise picking up an object. The computing system 1100 may generate one or more robot interaction movement commands to execute the motion plan.

In an embodiment, the computing system 1100 may form or be part of a vision system. The vision system may be a system which generates, e.g., vision information which describes an environment in which the robot 1300 is located, or, alternatively or additionally, describes an environment in which the camera 1200 is located. The vision information may include the 3D image information and/or the 2D image information discussed above, or some other image information. In some scenarios, if the computing system 1100 forms a vision system, the vision system may be part of the robot control system discussed above or may be separate from the robot control system. If the vision system is separate from the robot control system, the vision system may be configured to output information describing the environment in which the robot 1300 is located. The information may be outputted to the robot control system, which may receive such information from the vision system and perform motion planning and/or generate robot interaction movement commands based on the information. Further information regarding the vision system is detailed below.

In an embodiment, the computing system 1100 may communicate with the camera 1200 and/or with the robot 1300 via a direct connection, such as a connection provided via a dedicated wired communication interface, such as an RS-232 interface, a universal serial bus (USB) interface, and/or via a local computer bus, such as a peripheral component interconnect (PCI) bus. In an embodiment, the computing system 1100 may communicate with the camera 1200 and/or with the robot 1300 via a network. The network may be any type and/or form of network, such as a personal area network (PAN), a local-area network (LAN), e.g., an Intranet, a metropolitan area network (MAN), a wide area network (WAN), or the Internet. The network may utilize different techniques and layers or stacks of protocols, including, e.g., the Ethernet protocol, the internet protocol suite (TCP/IP), the ATM (Asynchronous Transfer Mode) technique, the SONET (Synchronous Optical Networking) protocol, or the SDH (Synchronous Digital Hierarchy) protocol.

In an embodiment, the computing system 1100 may communicate information directly with the camera 1200 and/or with the robot 1300, or may communicate via an intermediate storage device, or more generally an intermediate non-transitory computer-readable medium. For example, FIG. 1D illustrates a system 1500C, which may be an embodiment of the system 1500/1500A/1500B, that includes a non-transitory computer-readable medium 1400, which may be external to the computing system 1100, and may act as an external buffer or repository for storing, e.g., image information generated by the camera 1200. In such an example, the computing system 1100 may retrieve or otherwise receive the image information from the non-transitory computer-readable medium 1400. Examples of the non-transitory computer-readable medium 1400 include an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination thereof. The non-transitory computer-readable medium may form, e.g., a computer diskette, a hard disk drive (HDD), a solid-state drive (SSD), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), and/or a memory stick.

As stated above, the camera 1200 may be a 3D camera and/or a 2D camera. The 2D camera may be configured to generate a 2D image, such as a color image or a grayscale image. The 3D camera may be, e.g., a depth-sensing camera, such as a time-of-flight (TOF) camera or a structured light camera, or any other type of 3D camera. In some cases, the 2D camera and/or 3D camera may include an image sensor, such as a charge coupled device (CCD) sensor and/or a complementary metal oxide semiconductor (CMOS) sensor. In an embodiment, the 3D camera may include lasers, a LIDAR device, an infrared device, a light/dark sensor, a motion sensor, a microwave detector, an ultrasonic detector, a RADAR detector, or any other device configured to capture depth information or other spatial structure information.

As stated above, the image information may be processed by the computing system 1100. In an embodiment, the computing system 1100 may include or be configured as a server (e.g., having one or more server blades, processors, etc.), a personal computer (e.g., a desktop computer, a laptop computer, etc.), a smartphone, a tablet computing device, and/or any other computing system. In an embodiment, any or all of the functionality of the computing system 1100 may be performed as part of a cloud computing platform. The computing system 1100 may be a single computing device (e.g., a desktop computer), or may include multiple computing devices.

FIG. 2A provides a block diagram that illustrates an embodiment of the computing system 1100. The computing system 1100 in this embodiment includes at least one processing circuit 1110 and a non-transitory computer-readable medium (or media) 1120. In some instances, the processing circuit 1110 may include processors (e.g., central processing units (CPUs), special-purpose computers, and/or onboard servers) configured to execute instructions (e.g., software instructions) stored on the non-transitory computer-readable medium 1120 (e.g., computer memory). In some embodiments, the processors may be included in a separate/stand-alone controller that is operably coupled to the other electronic/electrical devices. The processors may implement the program instructions to control/interface with other devices, thereby causing the computing system 1100 to execute actions, tasks, and/or operations. In an embodiment, the processing circuit 1110 includes one or more processors, one or more processing cores, a programmable logic controller (“PLC”), an application specific integrated circuit (“ASIC”), a programmable gate array (“PGA”), a field programmable gate array (“FPGA”), any combination thereof, or any other processing circuit.

In an embodiment, the non-transitory computer-readable medium 1120, which is part of the computing system 1100, may be an alternative or addition to the intermediate non-transitory computer-readable medium 1400 discussed above. The non-transitory computer-readable medium 1120 may be a storage device, such as an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination thereof, for example, such as a computer diskette, a hard disk drive (HDD), a solid state drive (SSD), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, any combination thereof, or any other storage device. In some instances, the non-transitory computer-readable medium 1120 may include multiple storage devices. In certain implementations, the non-transitory computer-readable medium 1120 is configured to store image information generated by the camera 1200 and received by the computing system 1100. In some instances, the non-transitory computer-readable medium 1120 may store one or more model templates used for performing an object recognition operation. The non-transitory computer-readable medium 1120 may alternatively or additionally store computer-readable program instructions that, when executed by the processing circuit 1110, cause the processing circuit 1110 to perform one or more methodologies described herein.

FIG. 2B depicts a computing system 1100A that is an embodiment of the computing system 1100 and includes a communication interface 1130. The communication interface 1130 may be configured to, e.g., receive image information generated by the camera 1200 of FIGS. 1A-1D. The image information may be received via the intermediate non-transitory computer-readable medium 1400 or the network discussed above, or via a more direct connection between the camera 1200 and the computing system 1100/1100A. In an embodiment, the communication interface 1130 may be configured to communicate with the robot 1300 of FIG. 1C. If the computing system 1100 is external to a robot control system, the communication interface 1130 of the computing system 1100 may be configured to communicate with the robot control system. The communication interface 1130 may also be referred to as a communication component or communication circuit, and may include, e.g., a communication circuit configured to perform communication over a wired or wireless protocol. As an example, the communication circuit may include an RS-232 port controller, a USB controller, an Ethernet controller, a Bluetooth® controller, a PCI bus controller, any other communication circuit, or a combination thereof.

In an embodiment, as depicted in FIG. 2C, the non-transitory computer-readable medium 1120 may include a storage space 1128 configured to store one or more data objects discussed herein. For example, the storage space may store model templates, robotic arm move commands, and any additional data objects to which the computing system 1100B may require access.

In an embodiment, the processing circuit 1110 may be programmed by one or more computer-readable program instructions stored on the non-transitory computer-readable medium 1120. For example, FIG. 2D illustrates a computing system 1100C, which is an embodiment of the computing system 1100/1100A/1100B, in which the processing circuit 1110 is programmed by one or more modules, including an object recognition module 1121, a minimum viable region (MVR) determination module 1122, and a motion planning module 1129.

In an embodiment, the object recognition module 1121 may be configured to obtain and analyze image information as discussed throughout the disclosure. Methods, systems, and techniques discussed herein with respect to image information may use the object recognition module.

The MVR determination module 1122 may be configured to calculate, determine, and/or identify minimum viable regions according to image information and analysis performed or obtained by the object recognition module 1121. Methods, systems, and techniques discussed herein with respect to MVR determination may be performed by the MVR determination module 1122.

The motion planning module 1129 may be configured to plan the movement of a robot. For example, the motion planning module 1129 may derive individual placement locations/orientations, calculate corresponding motion plans, or a combination thereof for grabbing and moving objects. Methods, systems, and techniques discussed herein with respect to robotic arm movements may be performed by the motion planning module 1129.

With reference to FIGS. 2E, 2F, and 3A, methods related to the object recognition module 1121 that may be performed for image analysis are explained. FIGS. 2E and 2F illustrate example image information associated with image analysis methods, while FIG. 3A illustrates an example robotic environment associated with image analysis methods. Image analysis by a computing system, as referenced herein, may be performed according to or using spatial structure information that may include depth information which describes respective depth values of various locations relative to a chosen point. The depth information may be used to identify objects or estimate how objects are spatially arranged. In some instances, the spatial structure information may include or may be used to generate a point cloud that describes locations of one or more surfaces of an object. Spatial structure information is merely one form of possible image analysis, and other forms known by one skilled in the art may be used in accordance with the methods described herein.

In embodiments, the computing system 1100 may obtain image information representing an object in a camera field of view (e.g., 3210) of a camera (e.g., 1200/3200). In some instances, the object may be a first object (e.g., 3510) of one or more objects (e.g., 3510-3540) in the camera field of view 3210 of a camera 1200/3200. The image information 2600, 2700 may be generated by the camera (e.g., 1200/3200) when the group of objects 3000A/3000B/3000C/3000D is (or has been) in the camera field of view 3210 and may describe one or more of the individual objects. The object appearance describes the appearance of an object 3000A/3000B/3000C/3000D from the viewpoint of the camera 1200/3200. If there are multiple objects in the camera field of view, the camera may generate image information that represents the multiple objects or a single object, as necessary. The image information may be generated by the camera (e.g., 1200/3200) when the group of objects is (or has been) in the camera field of view, and may include, e.g., 2D image information and/or 3D image information.

As an example, FIG. 2E depicts a first set of image information, or more specifically, 2D image information 2600, which, as stated above, is generated by the camera 3200 and represents the objects 3000A/3000B/3000C/3000D/3550 of FIG. 3A. More specifically, the 2D image information 2600 may be a grayscale or color image and may describe an appearance of the objects 3000A/3000B/3000C/3000D/3550 from a viewpoint of the camera 3200. In an embodiment, the 2D image information 2600 may correspond to a single color channel (e.g., red, green, or blue color channel) of a color image. If the camera 3200 is disposed above the objects 3000A/3000B/3000C/3000D/3550, then the 2D image information 2600 may represent an appearance of respective top surfaces of the objects 3000A/3000B/3000C/3000D/3550. In the example of FIG. 2E, the 2D image information 2600 may include respective portions 2000A/2000B/2000C/2000D/2550, also referred to as image portions, that represent respective surfaces of the objects 3000A/3000B/3000C/3000D/3550. In FIG. 2E, each image portion 2000A/2000B/2000C/2000D/2550 of the 2D image information 2600 may be an image region, or more specifically a pixel region (if the image is formed by pixels). Each pixel in the pixel region of the 2D image information 2600 may be characterized as having a position that is described by a set of coordinates [U, V] and may have values that are relative to a camera coordinate system, or some other coordinate system, as shown in FIGS. 2E and 2F. Each of the pixels may also have an intensity value, such as a value between 0 and 255 or between 0 and 1023. In further embodiments, each of the pixels may include any additional information associated with pixels in various formats (e.g., hue, saturation, intensity, CMYK, RGB, etc.).
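
By way of illustration only, the following minimal sketch (not part of the original disclosure) shows one possible in-memory representation of 2D image information such as 2600 and of an image portion such as 2000A. The array dimensions, intensity values, and the helper name crop_image_portion are illustrative assumptions.

# Minimal sketch: 2D image information as a pixel region indexed by [U, V].
import numpy as np

# A grayscale image: rows correspond to the V coordinate, columns to U,
# and each element holds an 8-bit intensity value (0-255).
image_2600 = np.zeros((480, 640), dtype=np.uint8)

def crop_image_portion(image: np.ndarray, u_min: int, v_min: int,
                       u_max: int, v_max: int) -> np.ndarray:
    """Return the pixel region [u_min, u_max) x [v_min, v_max) as an image portion."""
    return image[v_min:v_max, u_min:u_max]

# Example: extract a portion representing one object's top surface (assumed bounds),
# and read the intensity value at pixel [U=120, V=60].
portion_2000A = crop_image_portion(image_2600, u_min=100, v_min=50, u_max=300, v_max=200)
intensity = image_2600[60, 120]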

As stated above, the image information may in some embodiments be all or a portion of an image, such as the 2D image information 2600. In examples, the computing system 3100 may be configured to extract an image portion 2000A from the 2D image information 2600 to obtain only the image information associated with a corresponding object 3000A. For instance, the computing system 3100 may extract the image portion 2000A by performing an image segmentation operation based on the 2D image information 2600 and/or 3D image information 2700 illustrated in FIG. 2F. In some implementations, the image segmentation operation may include detecting image locations at which physical edges of objects appear (e.g., edges of a box) in the 2D image information 2600 and using such image locations to identify an image portion (e.g., 5610) that is limited to representing an individual object in a camera field of view (e.g., 3210).

FIG. 2F depicts an example in which the image information is 3D image information 2700. More particularly, the 3D image information 2700 may include, e.g., a depth map or a point cloud that indicates respective depth values of various locations on one or more surfaces (e.g., top surface or other outer surface) of the objects 3000A/3000B/3000C/3000D/3550. In some implementations, an image segmentation operation for extracting image information may involve detecting image locations at which physical edges of objects appear (e.g., edges of a box) in the 3D image information 2700 and using such image locations to identify an image portion (e.g., 2730) that is limited to representing an individual object (e.g., 3000A) in a camera field of view.

The respective depth values may be relative to the camera 3200 which generates the 3D image information 2700 or may be relative to some other reference point. In some embodiments, the 3D image information 2700 may include a point cloud which includes respective coordinates for various locations on structures of objects in the camera field of view (e.g., 3210). In the example of FIG. 2F, the point cloud may include respective sets of coordinates that describe the location of the respective surfaces of the objects 3000A/3000B/3000C/3000D/3550. The coordinates may be 3D coordinates, such as [X Y Z] coordinates, and may have values that are relative to a camera coordinate system, or some other coordinate system. For instance, the 3D image information 2700 may include a first portion 2710, also referred to as an image portion, that indicates respective depth values for a set of locations 2710₁-2710ₙ, which are also referred to as physical locations on a surface of the object 3000D. Further, the 3D image information 2700 may include a second, a third, and a fourth portion 2720, 2730, and 2740. These portions may then further indicate respective depth values for sets of locations, which may be represented by 2720₁-2720ₙ, 2730₁-2730ₙ, and 2740₁-2740ₙ, respectively. These figures are merely examples, and any number of objects with corresponding image portions may be used. Similarly to as stated above, the 3D image information 2700 obtained may in some instances be a portion of a first set of 3D image information 2700 generated by the camera. In the example of FIG. 2E, if the 3D image information 2700 obtained represents a first object 3000A of FIG. 3A, then the 3D image information 2700 may be narrowed so as to refer to only the image portion 2710.
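
For illustration only, the following sketch (assumptions, not the disclosed implementation) shows 3D image information stored as a point cloud of [X, Y, Z] coordinates and a simple way to narrow it to a portion representing a single surface by masking on the depth (Z) values. The point count, depth bounds, and function name are illustrative.

# Minimal sketch: point cloud as an N x 3 array and depth-based narrowing.
import numpy as np

# N points, each an [X, Y, Z] coordinate in the camera coordinate system.
point_cloud_2700 = np.random.rand(10000, 3).astype(np.float32)

def extract_portion_by_depth(points: np.ndarray, z_min: float, z_max: float) -> np.ndarray:
    """Keep only points whose depth value falls within [z_min, z_max],
    e.g., points lying on the top surface of one layer of objects."""
    mask = (points[:, 2] >= z_min) & (points[:, 2] <= z_max)
    return points[mask]

# Example: isolate a candidate surface portion (analogous to an image portion such as 2710).
portion = extract_portion_by_depth(point_cloud_2700, z_min=0.40, z_max=0.45)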

In an embodiment, an image normalization operation may be performed by the computing system 1100 as part of obtaining the image information. The image normalization operation may involve transforming an image or an image portion generated by the camera 3200, so as to generate a transformed image or transformed image portion. For example, the image information obtained, which may include the 2D image information 2600, the 3D image information 2700, or a combination of the two, may undergo an image normalization operation to attempt to cause the image information to be altered in viewpoint, object pose, or lighting condition associated with the visual description information. Such normalizations may be performed to facilitate a more accurate comparison between the image information and model (e.g., template) information. The viewpoint may refer to a pose of an object relative to the camera 3200, and/or an angle at which the camera 3200 is viewing the object when the camera 3200 generates an image representing the object.

For example, the image information may be generated during an object recognition operation in which a target object is in the camera field of view 3210. The camera 3200 may generate image information that represents the target object when the target object has a specific pose relative to the camera. For instance, the target object may have a pose which causes its top surface to be perpendicular to an optical axis of the camera 3200. In such an example, the image information generated by the camera 3200 may represent a specific viewpoint, such as a top view of the target object. In some instances, when the camera 3200 is generating the image information during the object recognition operation, the image information may be generated with a particular lighting condition, such as a lighting intensity. In such instances, the image information may represent a particular lighting intensity, lighting color, or other lighting condition.

In an embodiment, the image normalization operation may involve adjusting an image or an image portion of a scene generated by the camera, so as to cause the image or image portion to better match a viewpoint and/or lighting condition associated with information of a model template. The adjustment may involve transforming the image or image portion to generate a transformed image which matches at least one of an object pose or a lighting condition associated with the visual description information of the model template.

The viewpoint adjustment may involve processing, warping, and/or shifting of the image of the scene so that the image represents the same viewpoint as the visual description information in the model template. Processing, for example, includes altering the color, contrast, or lighting of the image; warping of the scene may include changing the size, dimensions, or proportions of the image; and shifting of the image may include changing the position, orientation, or rotation of the image. In an example embodiment, processing, warping, and/or shifting may be used to alter an object in the image of the scene to have an orientation and/or a size which matches or better corresponds to the visual description information of the model template. If the model template describes a head-on view (e.g., top view) of some object, the image of the scene may be warped so as to also represent a head-on view of an object in the scene.

In various embodiments, the terms “computer-readable instructions” and “computer-readable program instructions” are used to describe software instructions or computer code configured to carry out various tasks and operations. In various embodiments, the term “module” refers broadly to a collection of software instructions or code configured to cause the processing circuit 1110 to perform one or more functional tasks. The modules and computer-readable instructions may be described as performing various operations or tasks when a processing circuit or other hardware component is executing the modules or computer-readable instructions.

One aspect of the present disclosure relates to a robotic system or any other computing system which is able to perform object detection (also referred to as object recognition), object size measurement, and/or minimum viable region detection. The object detection or size measurement may involve determining dimensions of an individual object in a scene or determining a boundary of the individual object. The object may be part of a group of objects, such as a box that is part of a group of objects. For instance, the robotic system may perform the object detection operation as part of a de-palletization operation in which the robotic system receives camera data which captures a scene having a pallet of objects, in which each layer of the pallet has objects placed close to each other.

Object detection in this scenario may involve processing or analyzing the camera data (image information) to distinguish among individual objects on a particular layer of the pallet, so as to be able to distinguish one individual object from other objects on the layer. A problem occurs when the objects on the pallet are all positioned flush with one another, making it difficult for the robotic system to detect or separate the objects from one another. The process described herein may allow the robotic system to identify a size and/or boundary of the object, so as to generate a plan for picking up the individual object from the pallet and moving it elsewhere. In some implementations, the robotic system may identify at least one minimum viable region for the objects in a scene. The minimum viable region is an estimation of a potential outline, or dimensions, of the particular object. Therefore, the robotic system may grab an individual object at the minimum viable region with the end effector apparatus (e.g., gripper) without knowing the exact dimensions of the object. The minimum viable region of an object represents a region on the top surface of the object which is estimated to exist entirely on the surface of a single object. Accordingly, attempting to grasp an object within this region ensures that the end effector apparatus contacts only a single object without extending over the edge of the object. Minimum viable region determinations, as discussed herein, may increase the accuracy and speed of an object-moving robot arm.

In the above example, the object may be, e.g., a box or other object placed next to other objects on a pallet or other platform. If a camera generates an image which captures a scene having the multiple objects, the image itself may not be completely reliable for purposes of distinguishing between the different objects. For instance, while some cameras may be able to generate an image that indicates depth values of various locations in a scene, if the multiple objects have the same height or are otherwise the same distance from the camera, the image may indicate substantially uniform depth values for a region covering the top surfaces of the multiple objects, especially if the multiple objects are closely packed together. Thus, such an image may provide limited information for purposes of identifying individual objects from among the multiple objects. While some cameras may be able to generate an image that captures a visual appearance of the objects, such as their top surfaces, these top surfaces may have lines or other visual markings printed on them. Thus, such an image may include lines, but each of the lines could be associated with a boundary of one of the objects (e.g., a first or second edge) or could be merely a visual marking (e.g., which could be a false edge). The systems and methods provided herein, therefore, may be used to determine a minimum viable region of an object surface. The minimum viable region represents a region on the top surface of the object which is estimated to exist entirely on the surface of a single object. The minimum viable region may be bordered by actual physical edges of an object and/or by false edges of the object. If the system is unable to easily distinguish between a false edge and a physical edge, this inability may be accounted for through the methods of defining a minimum viable region, as discussed below.

FIGS. 3A-3H illustrate an example environment in which the processes and methods described herein may be performed. FIG. 3A depicts an environment having a system 3500 (which may be an embodiment of the system 1500/1500A/1500B/1500C of FIGS. 1A-1D) that includes the computing system 3100 (e.g., an embodiment of computing system 1100), a robot 3300, and a camera 3200. The camera 3200 may be an embodiment of the camera 1200 and may be configured to generate image information which represents a scene in a camera field of view 3210 of the camera 3200, or more specifically represents objects (such as boxes) in the camera field of view 3210, such as objects 3000A, 3000B, 3000C, and 3000D. In one example, each of the objects 3000A-3000D may be, e.g., a container such as a box or crate, while the object 3550 may be, e.g., a pallet on which the containers are disposed.

In an embodiment, the system 3500 of FIG. 3A may include one or more light sources, such as light source 3600. The light source 3600 may be, e.g., a light emitting diode (LED), a halogen lamp, or any other light source, and may be configured to emit visible light, infrared radiation, or any other form of light toward surfaces of the objects 3000A-3000D. In some implementations, the computing system 3100 may be configured to communicate with the light source 3600 to control when the light source 3600 is activated. In other implementations, the light source 3600 may operate independently of the computing system 3100.

In an embodiment, the system 3500 may include a camera 3200 or multiple cameras 3200, including a 2D camera that is configured to generate 2D image information 2600 and a 3D camera that is configured to generate 3D image information 2700. The 2D image information 2600 (e.g., a color image or a grayscale image) may describe an appearance of one or more objects, such as the objects 3000A/3000B/3000C/3000D, in the camera field of view 3210. For instance, the 2D image information 2600 may capture or otherwise represent visual detail disposed on respective outer surfaces (e.g., top surfaces) of the objects 3000A/3000B/3000C/3000D, and/or contours of those outer surfaces. In an embodiment, the 3D image information 2700 may describe a structure of one or more of the objects 3000A/3000B/3000C/3000D/3550, wherein the structure for an object may also be referred to as an object structure or physical structure for the object. For example, the 3D image information 2700 may include a depth map, or more generally include depth information, which may describe respective depth values of various locations in the camera field of view 3210 relative to the camera 3200 or relative to some other reference point. The locations corresponding to the respective depth values may be locations (also referred to as physical locations) on various surfaces in the camera field of view 3210, such as locations on respective top surfaces of the objects 3000A/3000B/3000C/3000D/3550. In some instances, the 3D image information 2700 may include a point cloud, which may include a plurality of 3D coordinates that describe various locations on one or more outer surfaces of the objects 3000A/3000B/3000C/3000D/3550, or of some other objects in the camera field of view 3210. The point cloud is shown in FIG. 2F.

In the example of FIG. 3A, the robot 3300 (which may be an embodiment of the robot 1300) may include a robot arm 3320 having one end attached to a robot base 3310 and having another end that is attached to or is formed by an end effector apparatus 3330, such as a robot gripper. The robot base 3310 may be used for mounting the robot arm 3320, while the robot arm 3320, or more specifically the end effector apparatus 3330, may be used to interact with one or more objects in an environment of the robot 3300. The interaction (also referred to as robot interaction) may include, e.g., gripping or otherwise picking up at least one of the objects 3000A-3000D. For example, the robot interaction may be part of a de-palletization operation in which the robot 3300 is used to pick up the objects 3000A-3000D from the pallet and move the objects 3000A-3000D to a destination location. The end effector apparatus 3330 may have suction cups or other components for grasping or grabbing the object. The end effector apparatus 3330 may be configured, using a suction cup or other grasping component, to grasp or grab an object through contact with a single face or surface of the object, for example, via a top face.

The robot 3300 may further include additional sensors configured to obtain information used to implement the tasks, such as for manipulating the structural members and/or for transporting the robotic units. The sensors can include devices configured to detect or measure one or more physical properties of the robot 3300 (e.g., a state, a condition, and/or a location of one or more structural members/joints thereof) and/or of a surrounding environment. Some examples of the sensors can include accelerometers, gyroscopes, force sensors, strain gauges, tactile sensors, torque sensors, position encoders, etc.

FIG. 3B depicts a top view of the objects 3000A, 3000B, 3000C, and 3000D of FIG. 3A. Each object 3000A-3000D includes a surface 3001, a plurality of corners 3002, and, in this example, an open corner 3004. As used herein, “open corner” refers to any corner of the plurality of corners 3002 that is not adjacent to another object 3000. An open corner may be formed by the two edges of the top surface of an object that do not themselves border another object. In an example, the two edges may include two horizontal edges. It is not required that the entirety of the two edges be free of adjacent objects for a corner to be considered open. Referring to FIG. 3C, which depicts a close-up top view of the objects 3000A-3000D, additional features of each object 3000A-3000D are depicted. In a two-dimensional depiction, the object 3000A may have a surface defined by four edges: a lengthwise physical edge 3013, a widthwise physical edge 3015, a widthwise physical edge 3017, and a lengthwise physical edge 3019. In the example embodiment shown, the widthwise physical edge 3015 and the lengthwise physical edge 3013 are not adjacent to or flush with edges of any adjacent objects and thus may be referred to as open edges. In the example embodiment shown, the widthwise physical edge 3017 and the lengthwise physical edge 3019 are adjacent to and flush with the edges of the adjacent objects 3000B and 3000D and thus may be referred to as closed edges. The use of the terms “lengthwise” and “widthwise” to refer to the edges described above does not imply that a specific orientation of the objects and/or edges is required. In general, for approximately rectangular objects, lengthwise edges are adjacent to (and approximately perpendicular to) widthwise edges. Lengthwise and widthwise, as used herein, refer only to specific directions within the context of an object or group of objects, and not to specific absolute directions. The positioning of the objects 3000A, 3000B, and 3000D may make it more difficult for the computing system 3100 to differentiate the features of each object.

A method 4000 for determining a minimum viable region for an open corner 3004 is depicted in FIG. 4. The method 4000 may be stored on a non-transitory computer-readable medium and may be executed by at least one processing circuit, with the at least one processing circuit being in communication with a camera having a field of view.

In an operation, the method 4000 includes an operation 4002 of obtaining image information representing the physical characteristics of one or more objects 3000, for example the objects 3000A-3000D. The image information is generated by the camera 3200 and describes at least an object appearance associated with the one or more objects, with each object including a plurality of edges. FIG. 3A depicts obtaining or generating the image information. In embodiments, the image information is obtained in a three-dimensional or perspective view, and, to increase the accuracy of the method, the viewpoint is shifted to a two-dimensional view. The viewpoint adjustment may involve processing, warping, and/or shifting of the image information. Processing may include altering the color, contrast, or lighting of the image information; warping of the scene may include changing the size, dimensions, or proportions of the image information; and shifting of the image information may include changing the position, orientation, or rotation of the image information. Warping may involve determining a homography which defines a warping transformation that transforms the image information from depicting an object in three dimensions to depicting the object in two dimensions, for example a top view. In some instances, the warping may describe a rotation and/or a translation that matches the image information with corresponding points of a desired two-dimensional view, for example corners.
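
As a non-limiting illustration of the warping described above, the following sketch maps four observed corner points of an object's top face to the corresponding corners of a rectified two-dimensional (top) view. OpenCV is used here only as an example tool; it is not named in the disclosure, and the file path, point coordinates, and output size are assumptions.

# Hedged sketch: homography-based warp from a perspective view to a top view.
import numpy as np
import cv2

perspective_image = cv2.imread("initial_image.png")  # hypothetical input image

# Corners of the object's top face as observed in the perspective image, and
# where those corners should land in the rectified two-dimensional view.
observed_corners = np.float32([[212, 143], [498, 170], [470, 452], [180, 420]])
top_view_corners = np.float32([[0, 0], [400, 0], [400, 400], [0, 400]])

# The homography defines the warping transformation between the two views.
H = cv2.getPerspectiveTransform(observed_corners, top_view_corners)
top_view = cv2.warpPerspective(perspective_image, H, (400, 400))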

In embodiments, the computations and methods described below may be carried out after the camera is no longer imaging the object or objects, or after the object or objects have left the field of view.

In an operation, the method 4000 includes an operation 4004 for detecting a plurality of corners 3002 of the plurality of objects 3000 based on the image information. To detect any corners present, the computing system 1100/3100 may use a variety of methods. For example, corner detection may involve edge detection and subsequent determination of edge intersections. Edge detection may be performed based on analysis of 2D and 3D image information, e.g., point cloud information. Edge detection may include, for example, (i) 2D image analysis to identify lines or edges within the 2D image that may represent boundaries between objects, (ii) point cloud analysis involving layer segmentation and detection of different heights/depths to detect edges, or (iii) 2D or 3D image analysis to identify open edges. The examples described herein are provided by way of example only, and edge detection may further include alternative techniques as appropriate.

Edge detection may include, for example, 2D image analysis to identify lines or edges within the 2D image that may represent boundaries between objects. Such analysis may identify visual discontinuities within the 2D image that may represent edges. For example, such analysis may include analysis of pixel intensity discontinuity conditions or spiked pixel intensity conditions. Satisfying a defined pixel intensity discontinuity condition may include using changes in pixel intensity values, or more specifically, a derivative or gradient in pixel intensity values between regions having varying pixel intensities. The gradient or derivative may then be used to detect a spike in pixel intensity that is present at an edge or corner, particularly when moving perpendicular to the edge or corner. Additionally, the computing system 1100/3100 may apply a binary threshold to identify differences in pixel intensity, so as to define a spike or discontinuity between adjacent pixels, thereby identifying edges and corners.
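
The following sketch is one illustrative (not disclosed) way to realize the gradient-and-threshold idea above: compute intensity gradients, apply a binary threshold to mark discontinuities, and then look for corner candidates on the resulting mask. The threshold value, file path, and choice of the Sobel operator and corner detector are assumptions.

# Hedged sketch: pixel-intensity gradient spikes plus a binary threshold.
import numpy as np
import cv2

gray = cv2.imread("top_view.png", cv2.IMREAD_GRAYSCALE)  # hypothetical 2D image information

# Derivative/gradient of pixel intensity in the U and V directions.
grad_u = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
grad_v = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
grad_mag = np.sqrt(grad_u ** 2 + grad_v ** 2)

# Binary threshold: pixels whose gradient "spikes" above the threshold are treated
# as satisfying the pixel intensity discontinuity condition.
_, edge_mask = cv2.threshold(grad_mag, 60.0, 255.0, cv2.THRESH_BINARY)
edge_mask = edge_mask.astype(np.uint8)

# Candidate corners can then be sought where near-perpendicular edge responses meet.
corners = cv2.goodFeaturesToTrack(edge_mask, maxCorners=20, qualityLevel=0.1, minDistance=10)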

Edge detection may include, for example, point cloud analysis involving layer segmentation and detection of different heights/depths to detect edges. Adjacent objects may have differing heights. Accordingly, the detection of different heights (or depths) in a point cloud (3D image information) may be used to detect edges between objects. The computing system 1100/3100 may thus detect edges according to portions of a point cloud that satisfy a defined depth discontinuity condition.
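The depth discontinuity condition may be illustrated with the following sketch, which assumes the point cloud has been organized into a per-pixel depth map and that a 2 cm depth jump (an assumed, tunable value) indicates a boundary between objects of different heights.

import numpy as np

def depth_discontinuity_mask(depth_map, depth_jump=0.02):
    """Mark pixels of an organized depth map (per-pixel depth from the point
    cloud) where neighboring depths differ by more than depth_jump meters,
    i.e. where a defined depth discontinuity condition is satisfied."""
    d = depth_map.astype(np.float32)
    # Depth change relative to the left neighbor and the upper neighbor.
    jump_x = np.abs(np.diff(d, axis=1, prepend=d[:, :1]))
    jump_y = np.abs(np.diff(d, axis=0, prepend=d[:1, :]))
    return (jump_x > depth_jump) | (jump_y > depth_jump)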

In further examples, edge detection may be performed by detecting physical edges that lack adjacent objects. Where objects lack adjacent objects, for example, where objects are located on an outside perimeter of a group of objects, the edges of the group of objects may be detected as physical edges of the associated individual objects.

In embodiments, any of the above discussed methods for edge detection may be combined with each other or with other edge detection methods to increase the accuracy or reliability of edge detection.

In an operation, the method 4000 includes an operation 4005 of identifying an open corner 3004 from the plurality of corners 3002. As discussed above, "open corner" refers to a corner of the plurality of corners 3002 that is not adjacent to another object 3000. In embodiments, the system may be configured to identify the plurality of corners, and once the corners 3002 are identified, to choose the open corner, such as target open corner 3004A, from the open corners among the corners 3002. In embodiments, each of the plurality of corners 3002 may be an open corner, because the corner detection operation 4004 may detect corners based on physical edges identified by a lack of adjacent objects. Further, when identifying the target open corner, the system may recognize that the open corner is not adjacent to another object. The target open corner 3004A may be a single target open corner.

Open corners (also referred to as convex or exterior corners) may be identified through analysis of 3D image information, for example in the form of the point cloud, as discussed above. Open corners may be identified, for example, by identifying vertices within the point cloud (e.g., based on edge intersection or other means) and then subjecting the identified vertices to one or more criteria (e.g., length, depth, width, or orthogonality criteria) to determine whether they represent open corners. Further details regarding the use of image information to identify corners and potential edges may be found in U.S. Pat. No. 10,614,340, issued Apr. 7, 2020, which is incorporated by reference in its entirety.

As noted above and referring to FIG. 3C, which depicts an expanded top view of the objects 3000A-3000D, additional features of each object 3000A-3000D are depicted. In a two-dimensional depiction, the object 3000A may have a surface defined by four edges: the lengthwise physical edge 3013, the widthwise physical edge 3015, the widthwise physical edge 3017, and the lengthwise physical edge 3019. In the example embodiment shown, the widthwise physical edge 3017 and the lengthwise physical edge 3019 are adjacent to, and flush with, the edges of the adjacent objects 3000B and 3000D. As stated previously, such positioning of the objects 3000A, 3000B, and 3000D may make it more difficult for the computing system 3100 to differentiate the features of each object.

In an operation, the method 4000 includes an operation 4006 of defining a minimum viable region 3006, as illustrated in FIGS. 3E and 3G, for the target open corner 3004A. The minimum viable region 3006 represents a region on the surface 3001 of the object 3000 associated with the target open corner 3004A that may be grabbed by the end effector apparatus 3330 of the robot arm 3320 to move the object 3000. As discussed above, the robot arm 3320 may use, for example, a suction grip of the end effector apparatus 3330 to grasp an object, such as the target object that includes the target open corner 3004A, by the top surface. The minimum viable region 3006 of an object represents a region on the top surface of the object which is estimated to exist entirely on the surface of a single object. Accordingly, attempting to grasp the target object within the region defined by the minimum viable region 3006 may ensure that the end effector apparatus 3330 contacts only the target object (e.g., a single object) without extending over the edge of the target object, such that adjacent objects are not also grasped when grasping the target object. In situations where edges of the object 3000 are adjacent to or in contact with other objects, the computing system 1100/3100 may have difficulty identifying, or take longer to identify, exact dimensions of the object. Therefore, it may be difficult to accurately define a region where the robot arm 3320 may securely grab the object 3000 without extending over an edge of the object 3000 and/or contacting another, separate, object. Thus, the minimum viable region 3006 defines a region based on estimated or potential dimensions where the robot arm 3320 may grab the object 3000 without knowing exact dimensions of the object 3000. The minimum viable region 3006 is intended to define a region in which the robot arm 3320 may grab an object 3000. It is understood that the size of the minimum viable region 3006 may be different, sometimes significantly different, than the size of the object 3000 on which it is found. Operations 4008-4016 describe how the minimum viable region 3006 may be calculated and validated.

In an operation, the method 4000 includes an operation 4008 for generating a plurality of candidate edge segments. The plurality of candidate edge segments may include a plurality of widthwise candidate edge segments and a plurality of lengthwise candidate edge segments. The plurality of widthwise candidate edge segments and the plurality of lengthwise candidate edge segments represent edges or portions of edges that are candidates for corresponding to the widthwise physical edge 3017 and the lengthwise physical edge 3019, respectively, of the object 3000 associated with the target open corner 3004A. The object 3000 associated with the target open corner 3004A may be positioned adjacent to or flush with other objects 3000, creating a circumstance where the computing system 1100/3100 may be challenged in differentiating where the object 3000A ends and other objects (3000B, 3000C, 3000D, etc.) begin.

The computing system 1100/3100 may first identify a plurality of potential edge segments within the 2D image information. The potential edge segments may be identified from any type of detectable visual marking having properties that make it a candidate for being recognized as representing a physical edge of the object. For example, visual markings may include edges, creases, gaps, changes in coloration, and other discontinuities. The potential edge segments may then be further processed, for example, through clustering techniques, to identify a plurality of candidate edge segments. Appropriate clustering techniques are described, for example, in U.S. patent application Ser. No. 16/791,024, filed Feb. 14, 2020, which is incorporated herein by reference in its entirety. In an embodiment, candidate edge segments may be identified from potential edge segments based on being substantially perpendicular (e.g., within 5 degrees of perpendicular) to either the widthwise physical edge 3017 (for lengthwise candidate edge segments) or the lengthwise physical edge 3013 (for widthwise candidate edge segments). For example purposes, FIG. 3C illustrates an individual widthwise candidate edge segment 3009 of the plurality of widthwise candidate edge segments and an individual lengthwise candidate edge segment 3011 of the plurality of lengthwise candidate edge segments.

In embodiments, the detection of the plurality of candidate edge segments may be limited according to minimum and maximum candidate sizes. The computing system 1100/3100 may determine minimum and maximum candidate sizes based on expected object sizes. Expected objects may have length, width, and height dimensions. A minimum candidate size may be defined according to the smallest face (e.g., according to two of the three dimensions) to be found in the expected objects. In some embodiments, a minimum candidate size may be defined by length and width dimensions and/or by a diagonal dimension. A maximum candidate size may be defined according to the largest face (e.g., according to two of the three dimensions) to be found in the expected objects. A minimum candidate size 3016 and a maximum candidate size 3018 are illustrated, by way of example, in FIG. 3D. A minimum candidate size 3016 may thus be associated with the smallest possible object face and a maximum candidate size 3018 may thus be associated with a largest possible object face. The minimum candidate size 3016 may include a widthwise dimension 3016A and a lengthwise dimension 3016B which represent the dimensions of the smallest possible object face, while the maximum candidate size 3018 may include a widthwise dimension 3018A and a lengthwise dimension 3018B which represent the dimensions of the largest possible object face. In some embodiments, only the region between the minimum candidate size 3016 and the maximum candidate size 3018 is analyzed for the generation of potential edge segments.

In embodiments, the operation 4008 may operate to combine aligned candidate edge segments. For example, where one or more candidate edge segments are aligned, they may be combined by the computing system 1100/3100 for further analysis. Aligned candidate edge segments may have substantial co-linearity (also referred to as substantially similar alignment), which may be defined according to a predefined angle threshold and/or a predetermined offset threshold. The angle threshold for two candidate edge segments may require, e.g., that an angle between the two candidate edge segments be within the angle threshold (e.g., a certain number of degrees, such as 5°, 4°, 3°, 2°, or 1°), or that the respective angles formed by each of the two candidate edge segments be within the angle threshold. The offset threshold for two candidate edge segments may require, e.g., that the candidate edge segments have an offset smaller than the offset threshold. In an embodiment, an offset between two candidate edge segments may be defined by a shortest distance between respective lines extending or otherwise extrapolated from the candidate edge segments. Aligned candidate edge segments may be combined to create larger candidate edge segments.
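A minimal sketch of the co-linearity test and segment merging described above is shown below. It assumes each candidate edge segment is represented by a pair of 2D endpoints; the 2-degree angle threshold and 3-pixel offset threshold are illustrative assumptions rather than required values.

import numpy as np

def _angle(seg):
    (x1, y1), (x2, y2) = seg
    return np.arctan2(y2 - y1, x2 - x1)

def _offset(seg_a, seg_b):
    """Shortest distance from an endpoint of seg_b to the infinite line
    extrapolated from seg_a."""
    p = np.asarray(seg_a[0], dtype=float)
    d = np.asarray(seg_a[1], dtype=float) - p
    d = d / np.linalg.norm(d)
    q = np.asarray(seg_b[0], dtype=float) - p
    # Perpendicular component of q relative to seg_a's direction.
    return abs(d[0] * q[1] - d[1] * q[0])

def are_aligned(seg_a, seg_b, angle_thresh_deg=2.0, offset_thresh=3.0):
    """Decide whether two candidate edge segments are substantially co-linear."""
    angle_diff = abs(_angle(seg_a) - _angle(seg_b)) % np.pi
    angle_diff = min(angle_diff, np.pi - angle_diff)  # orientation-agnostic
    return (np.degrees(angle_diff) <= angle_thresh_deg
            and _offset(seg_a, seg_b) <= offset_thresh)

def merge_aligned(seg_a, seg_b):
    """Combine two aligned segments into one larger segment spanning the
    extreme endpoints along the shared direction."""
    pts = np.array(list(seg_a) + list(seg_b), dtype=float)
    direction = np.asarray(seg_a[1], dtype=float) - np.asarray(seg_a[0], dtype=float)
    t = pts @ direction          # scalar projection of each endpoint
    return (tuple(pts[np.argmin(t)]), tuple(pts[np.argmax(t)]))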

In an operation, the method 4000 includes an operation 4010 of determining a plurality of candidate edges from the plurality of candidate edge segments. Referring now to FIGS. 3C and 3D, the operation of selecting candidate edges from the plurality of candidate edge segments is described. A series of thresholds or filters may be applied to eliminate candidate edge segments that are less likely, unlikely, or incapable of representing physical edges of the object under analysis. The operation 4010 may determine a plurality of candidate edges, including a plurality of widthwise candidate edges (represented in FIG. 3C by the individual widthwise candidate edge 3008) and a plurality of lengthwise candidate edges (represented in FIG. 3C by the individual lengthwise candidate edge 3010), as estimations of the widthwise physical edge 3017 and the lengthwise physical edge 3019. The operation 4010 may select candidate edges according to the application of one or more criteria.

A first threshold or criterion may be a position criterion or a physical edge to segment threshold. The position criterion represents an evaluation of whether or not a candidate edge segment falls within a threshold distance of a known open physical edge, and more specifically, a known open physical edge oriented substantially perpendicular to the candidate edge segment. FIG. 3C illustrates the widthwise candidate edge segment 3009, having a proximal end point 3009A and a distal end point 3009B, and positioned substantially perpendicular to the lengthwise physical edge vector 3012 corresponding to the lengthwise physical edge 3013, which is one of the open physical edges. The proximal end point 3009A is positioned at a proximal end of the widthwise candidate edge segment 3009 with respect to the lengthwise physical edge 3013, and the distal end point 3009B is positioned at a distal end of the widthwise candidate edge segment 3009 with respect to the lengthwise physical edge 3013. The position criterion represents an evaluation of whether the segment to edge distance 3030A between the proximal end point 3009A and the lengthwise physical edge 3013 is within a defined minimum value. The defined minimum value may be set as the length of the minimum dimension (Min) of the minimum candidate size 3016 weighted by a scaling factor δ₁. Accordingly, the position criterion may be expressed as 0 ≤ distance 3030A ≤ δ₁*Min. The scaling factor δ₁ may be set as a value between 0.4 and 0.6 or between 0.4 and 0.5. The position criterion assures that the proximal end point 3009A of the widthwise candidate edge segment 3009 is not spaced from the lengthwise physical edge 3013 by more than approximately half the length of the minimum dimension (Min) of the minimum candidate size 3016. Any candidate edge segments that do not satisfy the position criterion may be eliminated as possible members of the plurality of candidate edges. The position criterion may also be applied to the plurality of lengthwise candidate edge segments, such as the lengthwise candidate edge segment 3011 having a proximal end point 3011A and a distal end point 3011B. The proximal end point 3011A may be evaluated according to the position criterion for proximity to the widthwise physical edge 3015 based on the segment to edge distance 3032A.

In further embodiments, the distal end point 3009B of the widthwise candidate edge segment 3009 may be evaluated according to the position criterion for proximity to the lengthwise dimension 3018B of the maximum candidate size 3018 based on the segment to edge distance 3030B. The distal end point 3011B of the lengthwise candidate edge segment 3011 may be evaluated according to the position criterion for proximity to the widthwise dimension 3018A of the maximum candidate size 3018 based on the segment to edge distance 3032B. The position criterion as applied to the distal end points 3009B and 3011B may be used instead of, or in addition to, the position criterion as applied to the proximal end points 3009A and 3011A.

The position criterion expects that, if a widthwise candidate edge segment 3009 or a lengthwise candidate edge segment 3011 corresponds to either the widthwise physical edge 3017 or the lengthwise physical edge 3019, either the proximal end points 3009A/3011A or the distal end points 3009B/3011B would be positioned within a threshold distance of the physical edges or the maximum candidate size of the object. Accordingly, the position criterion evaluates whether a potential candidate edge segment has an end point near to a known physical edge or an expected physical edge (as represented by the maximum candidate size) of the object being analyzed. The scaling factor δ₁ may be selected to account for or address noise, sensor discrepancies, or other sources of error in identifying edge segments.

The second criterion, also referred to as a segment length criterion or a segment length threshold, evaluates whether the length of a candidate edge segment exceeds a threshold. In embodiments, the threshold may be set as the length of the minimum dimension (Min) of the minimum candidate size 3016 weighted by a scaling factor δ₂. Accordingly, the length criterion may compare an edge segment length 3051 between the proximal end point 3009A/3011A and the distal end point 3009B/3011B to the minimum candidate size 3016. If the edge segment length 3051 of a widthwise candidate edge segment 3009 or a lengthwise candidate edge segment 3011 is smaller than a percentage of the length of the minimum dimension of the minimum candidate size 3016, the computing system 1100/3100 may eliminate the candidate edge segment 3009/3011 from consideration as a candidate edge. The length criterion may also be written as δ₂*Min ≤ edge segment length 3051. The scaling factor δ₂ may have a value in a range between 0.6 and 0.8, between 0.65 and 0.75, between 0.69 and 0.71, or approximately 0.7.

The second criterion expects that, for the candidate edge segment to correspond with a physical edge and therefore be considered as a candidate edge, the candidate edge segment should be long enough to exceed a portion of the minimum dimension of the minimum candidate size. Thus, candidate edge segments that do not meet the segment length threshold may not be considered as candidates that potentially represent physical edges. The scaling factor δ₂ may be selected to account for or address noise, sensor discrepancies, or other sources of error in identifying edge segments.

The third criterion, also referred to as an orthogonality criterion or a segment orthogonality threshold, evaluates whether the candidate edge segments are substantially perpendicular to either the lengthwise physical edge 3013 or the widthwise physical edge 3015. As used herein, the term substantially perpendicular means within 5 degrees of exactly perpendicular. For example, the widthwise candidate edge segment 3009 is compared to the lengthwise physical edge 3013 to determine substantial perpendicularity or substantial orthogonality, and the lengthwise candidate edge segment 3011 is compared to the widthwise physical edge 3015 to determine substantial perpendicularity or substantial orthogonality. The third criterion expects that, for widthwise candidate edge segments 3009 or lengthwise candidate edge segments 3011 to correspond with either the widthwise physical edge 3017 or the lengthwise physical edge 3019, respectively, the candidate edge segment should be substantially perpendicular to the physical edge from which it extends. Candidate edge segments that do not satisfy the orthogonality criterion may be eliminated as potential candidate edges.

The candidate edges may be selected or determined from the plurality of candidate edge segments that satisfy each of the three criteria: the position criterion, the length criterion, and the orthogonality criterion. The plurality of candidate edges may include a plurality of widthwise candidate edges and a plurality of lengthwise candidate edges. FIG. 3C illustrates an individual widthwise candidate edge 3008 and an individual lengthwise candidate edge 3010. The widthwise candidate edge 3008 is aligned with the associated widthwise candidate edge segment 3009 and extends substantially perpendicularly from the lengthwise physical edge 3013. The lengthwise candidate edge 3010 is aligned with the associated lengthwise candidate edge segment 3011 and extends substantially perpendicularly to the widthwise physical edge 3015.
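The three criteria may be combined into a single filter, as in the following sketch. The segment is represented by its proximal and distal end points, the known open physical edge by a point and a direction vector, and min_dim is the minimum dimension (Min) of the minimum candidate size; the default δ₁, δ₂, and 5-degree values follow the ranges given above, but the function itself is an illustrative sketch rather than the claimed implementation.

import numpy as np

def passes_edge_criteria(proximal, distal, edge_point, edge_dir,
                         min_dim, delta1=0.5, delta2=0.7, ortho_tol_deg=5.0):
    """Apply the position, length, and orthogonality criteria to one
    candidate edge segment; return True only if all three are satisfied."""
    proximal = np.asarray(proximal, dtype=float)
    distal = np.asarray(distal, dtype=float)
    edge_point = np.asarray(edge_point, dtype=float)
    edge_dir = np.asarray(edge_dir, dtype=float)
    edge_dir = edge_dir / np.linalg.norm(edge_dir)

    # Position criterion: 0 <= distance(proximal, open edge) <= delta1 * Min.
    rel = proximal - edge_point
    dist_to_edge = abs(rel[0] * edge_dir[1] - rel[1] * edge_dir[0])
    if dist_to_edge > delta1 * min_dim:
        return False

    # Segment length criterion: delta2 * Min <= segment length.
    seg_vec = distal - proximal
    seg_len = np.linalg.norm(seg_vec)
    if seg_len < delta2 * min_dim:
        return False

    # Orthogonality criterion: segment within 5 degrees of perpendicular
    # to the open physical edge (|cos(angle)| <= sin(5 degrees)).
    cos_angle = abs(np.dot(seg_vec / seg_len, edge_dir))
    return cos_angle <= np.sin(np.radians(ortho_tol_deg))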

In an operation, the method 4000 includes an operation 4012 for generating a plurality of intersection points 3024 between respective ones of the plurality of widthwise candidate edges and the plurality of lengthwise candidate edges. An intersection point 3024 is defined as the position where any one of the plurality of widthwise candidate edges, or a projection thereof, intersects with any one of the plurality of lengthwise candidate edges, or a projection thereof. Projections may be used in situations where an identified candidate edge does not extend far enough to intersect with one of the perpendicularly oriented candidate edges. An individual intersection point 3024 between the widthwise candidate edge segment 3009 and the lengthwise candidate edge segment 3011 is depicted in FIG. 3C. As depicted in FIG. 3E, each widthwise candidate edge 3008A, 3008B, 3008C and lengthwise candidate edge 3010A, 3010B, 3010C may be associated with a plurality of intersection points 3024A-3024I. For example, the widthwise candidate edge 3008A intersects with the lengthwise candidate edges 3010A, 3010B, and 3010C to create three distinct intersection points 3024A, 3024D, and 3024G. Each of these intersection points 3024A-3024I represents a potential corner of the target object opposing the target open corner 3004A.
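Intersection points, including those formed by projections of candidate edges, may be computed by intersecting the infinite lines through each candidate edge, as in the following sketch (each edge is assumed to be given as a pair of 2D points; this is an illustration rather than the claimed implementation).

import numpy as np

def intersection_point(width_edge, length_edge):
    """Intersection of a widthwise candidate edge and a lengthwise candidate
    edge. Each edge is a pair of 2D points; the edges are treated as infinite
    lines, so the result may lie on a projection of either edge. Returns None
    for (near-)parallel edges."""
    p1, p2 = (np.asarray(p, dtype=float) for p in width_edge)
    q1, q2 = (np.asarray(q, dtype=float) for q in length_edge)
    d1, d2 = p2 - p1, q2 - q1
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None  # parallel edges never intersect
    t = ((q1[0] - p1[0]) * d2[1] - (q1[1] - p1[1]) * d2[0]) / denom
    return p1 + t * d1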

In an operation, the method 4000 includes an operation 4014 of generating a candidate minimum viable region that correlates with the target open corner 3004A. As stated above, the minimum viable region 3006 (shown in FIG. 3C) represents a region on the surface 3001 of the object 3000 associated with the target open corner 3004A that may be, or is available to be, grabbed or grasped by the robot arm 3320 so as to move the object 3000. In applications where the object 3000 associated with the target open corner 3004A is adjacent to or in contact with other objects, the computing system 3100 may not accurately and/or precisely estimate the dimensions of the object 3000, and, therefore, it is difficult to accurately define a region where the robot arm 3320 may securely grab the object 3000. Therefore, the minimum viable region 3006 defines a region based on estimated or potential dimensions where the robot arm 3320 may grab the object 3000 without knowing exact dimensions. The minimum viable region 3006 is an area on the surface 3001 of the object 3000 defined by the target open corner 3004A, a widthwise candidate edge 3008 of the plurality of widthwise candidate edges, a lengthwise candidate edge 3010 of the plurality of lengthwise candidate edges, and an intersection point 3024.

As noted above, a plurality of widthwise candidate edges 3008A, 3008B, 3008C, a plurality of lengthwise candidate edges 3010A, 3010B, 3010C, and a plurality of intersection points 3024A-3024I are identified in the previous operations. Together, the plurality of widthwise candidate edges 3008A, 3008B, 3008C, the plurality of lengthwise candidate edges 3010A, 3010B, 3010C, and the plurality of intersection points 3024A-3024I may define a set of potential minimum viable region candidates 3066, individual ones of which (e.g., potential minimum viable region candidates 3006A-3006G) are illustrated in FIG. 3E. In embodiments, a minimum viable region candidate associated with the target open corner 3004A may be identified according to an intersection point 3024, as each intersection point 3024 further specifies the widthwise candidate edge 3008 and the lengthwise candidate edge 3010 by which it is formed. As depicted in FIG. 3E, numerous potential minimum viable regions 3006 may be associated with a single widthwise candidate edge 3008A, 3008B, 3008C or a single lengthwise candidate edge 3010A, 3010B, 3010C. Additionally, each of the set of potential minimum viable region candidates 3066 is associated with a single intersection point 3024. As discussed above, each of the set of potential minimum viable region candidates 3066 fits within the minimum candidate size 3016 that is associated with a smallest possible object and the maximum candidate size 3018 that is associated with a largest possible object.

The potential minimum viable region candidates 3066 of FIG. 3E are presented in Table 1 below as combinations of widthwise candidate edges 3008A, 3008B, 3008C and lengthwise candidate edges 3010A, 3010B, 3010C.

TABLE 1

           3008A    3008B    3008C
  3010A    3006A    3006B    3006C
  3010B    3006D    3006E    3006F
  3010C    3006G    3006H    3006I

The candidate minimum viable region 3067 may be chosen from the set of potential minimum viable region candidates 3066 based on the potential minimum viable region candidate 3006A, 3006B, etc., having the smallest diagonal distance between the target open corner 3004A and the associated intersection point 3024. The smallest distance may be used as a primary factor in the determination of the candidate minimum viable region 3067. Due to the use of the scaling factor δ₂ for the length criterion, the candidate minimum viable region 3067 may be smaller than the minimum candidate size 3016. In further embodiments, the minimum candidate size 3016 may be set as a minimum size for a potential minimum viable region candidate 3066 to be selected as the candidate minimum viable region 3067.

By selecting the smallest diagonal distance for the candidate minimum viable region 3067, the system determines that the candidate minimum viable region 3067 is not larger than the actual object 3000 on which it is located. Although the identified edges that form the candidate minimum viable region 3067 may not represent the actual dimensions of the target object 3000A, they do represent possible edges of the target object 3000A. By selecting the smallest diagonal distance between the target open corner 3004A and the associated intersection point 3024, the computing system 1100/3100 determines an area, the candidate minimum viable region 3067, with an increased likelihood of existing only on the target object 3000A. In embodiments, the system may determine that the candidate minimum viable region 3067 does not intersect more than one object. Thus, grasping the target object 3000A within the candidate minimum viable region 3067 increases the reliability of the grasping operation by reducing the possibility that the robot arm attempts to grasp more than one object 3000 at a time.
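A sketch of the smallest-diagonal selection is given below. It assumes each intersection point has been paired with the potential minimum viable region candidate it defines (the pairing container is a hypothetical construct for this example) and simply returns the candidate whose intersection point lies closest to the target open corner.

import numpy as np

def select_candidate_mvr(open_corner, candidates):
    """Choose the candidate minimum viable region whose intersection point
    has the smallest diagonal distance to the target open corner.

    candidates: iterable of (intersection_point, region) pairs
    """
    open_corner = np.asarray(open_corner, dtype=float)
    best_region, best_dist = None, np.inf
    for point, region in candidates:
        dist = np.linalg.norm(np.asarray(point, dtype=float) - open_corner)
        if dist < best_dist:
            best_region, best_dist = region, dist
    return best_region, best_dist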

Various other methods of determining a minimum viable region candidate from the set of potential minimum viable region candidates 3066 may be used. For example, in the event that a potential minimum viable region candidate includes an intersection point 3024 that is also a previously identified open corner 3004, that potential minimum viable region candidate may be selected, as it may be assumed to correlate with the entire surface 3001 of the object 3000. Alternatively, if a potential minimum viable region candidate has substantially similar dimensions to either the minimum candidate size 3016 or the maximum candidate size 3018, for example within a certain percentage threshold, it may be assumed that the object 3000 is either the minimum candidate size 3016 or the maximum candidate size 3018, respectively, and the potential minimum viable region candidate may be selected. In some embodiments, the candidate minimum viable region 3067 to be chosen from the set of potential minimum viable region candidates 3066 may be based on the potential minimum viable region candidate 3006A, 3006B, etc., having the largest area or a median area of the set of potential minimum viable region candidates 3066. In some embodiments, the candidate minimum viable region 3067 to be chosen from the set of potential minimum viable region candidates 3066 may be based on the potential minimum viable region candidate 3006A, 3006B, etc., having an intersection point 3024 associated with the shortest candidate edge.

In an operation, the method 4000 includes an operation 4016 of validating or adjusting the minimum viable region candidate 3067 to generate the minimum viable region 3006. Validation of the minimum viable region candidate 3067 may include one or more techniques, as discussed below. The validated and/or adjusted minimum viable region candidate 3067 may thus be defined as the minimum viable region 3006 and may be used by the computing system 1100/3100 as a detection hypothesis, or to augment a detection hypothesis, for identifying objects. As used herein, the term detection hypothesis refers to a hypothesis about the size or shape of an object as determined by the computing system 1100/3100. In embodiments, detection hypotheses may be confirmed via further analysis (e.g., using additional image analysis techniques), through robotic manipulation, and/or through additional means.

A minimum viable region candidate 3067 may be found for each open corner 3004 identified in operation 4004 (according to the operations 4006-4014), and the minimum viable region candidates 3067 so detected may be validated by comparison to other minimum viable region candidates 3067. For example, the computing system 1100/3100 may perform overlap validation or occlusion validation. The computing system 1100/3100 may determine whether a portion of the minimum viable region candidate 3067 of the target open corner 3004A intersects with a minimum viable region candidate 3067 associated with a different open corner 3004. In such a case, a comparison between the minimum viable region candidate 3067 of the target open corner 3004A and the minimum viable region candidate 3067 of the different open corner 3004 may determine that the open corners 3004/3004A belong to the same object 3000A (occlusion validation), or that the open corners 3004/3004A belong to different objects 3000 (overlap validation).

In the case of a shared object, the computing system 1100/3100 may perform the occlusion validation. Two minimum viable region candidates 3067 belonging to the same target object 3000A may occlude one another. The computing system 1100/3100 may combine the information of both minimum viable region candidates 3067 to generate the minimum viable region 3006, or may adjust the minimum viable region candidate 3067 of the target open corner 3004A to incorporate the information of the minimum viable region candidate 3067 of the different open corner 3004 to create a more accurate minimum viable region 3006 for the object 3000A.

Alternatively, in the event that the corners do not belong to the same object 3000, the computing system 1100/3100 may decrease confidence levels associated with the two overlapping minimum viable region candidates 3067. In determining which of the minimum viable region candidates 3067 to designate as the minimum viable region 3006 for further processing, the computing system 1100/3100 may select a minimum viable region candidate 3067 having the highest confidence level (for example, having fewer or no overlaps with other minimum viable region candidates 3067).

In embodiments, the accuracy of the minimum viable region candidates 3067 may further be increased using additional factors. For example, in the event that a pallet is known to contain a uniform type of objects (e.g., objects having a single SKU), the minimum viable regions for each of the objects (and particularly, the minimum viable regions that match the dimensions of the objects) may be expected to be substantially uniform. In such a case, several different techniques may be employed to identify and validate minimum viable region candidates 3067 from the potential minimum viable region candidates 3066. Single-SKU or uniform object methods may include one or more of a template verification operation, a packing verification operation, and a corner classification operation.

Referring now to FIG. 3F, in embodiments, for an object repository (e.g., a pallet, container, or other object repository) containing uniform type objects, minimum viable region candidates 3067 may be identified and/or validated from the potential minimum viable region candidates 3066 based on a template verification operation. Aspects of the object recognition methods performed herein are described in greater detail in U.S. application Ser. No. 16/991,510, filed Aug. 12, 2020, and U.S. application Ser. No. 16/991,466, filed Aug. 12, 2020, each of which is incorporated herein by reference. The template verification operation proceeds based on the assumption that uniform objects will have similar dimensions and visual characteristics. Portions of the image information defined by the potential minimum viable region candidates 3066 may be analyzed to generate templates 3068 corresponding to each of the potential minimum viable region candidates 3066. Each template 3068 may include information generated from the associated image portion, including at least a textured value, a color value, and a dimension value. The textured value may define whether the image portion of the potential minimum viable region candidate 3066 identifies a textured or textureless surface. The color value may define the color of the image portion representing the potential minimum viable region candidate 3066. The dimension value may represent the edge dimensions and/or area of the potential minimum viable region candidate 3066. The templates 3068 may be compared with one another to identify templates 3068 that match in one or more of the textured value, the color value, and the dimension value. Matching in one or more of these values may indicate that the potential minimum viable region candidates 3066 associated with the matching templates 3068 represent true physical objects 3000. Where the objects 3000 are of a uniform type, it may be expected that they have matching templates 3068. Thus, by identifying matching templates 3068, one or more minimum viable region candidates 3067 may be identified from the potential minimum viable region candidates 3066. In embodiments, the identified one or more minimum viable region candidates 3067 may further be validated as minimum viable regions 3006 based on the template verification operation.

Referring now to FIGS. 3G and 3H, in embodiments, for an object repository 3090 (e.g., a pallet, container, or other object repository) containing uniform type objects 3091, minimum viable region candidates 3067 (not shown here) may be identified and/or validated from the potential minimum viable region candidates 3066 based on a packing verification operation. If it is known that the object repository 3090 is fully packed (e.g., having a layer completely occupied by objects) or it is known that the object repository is packed around the edges, this information may be used to assist or augment the identification and validation of minimum viable region candidates 3067 from the potential minimum viable region candidates 3066 in several ways.

First, the packing verification operation may use the total area of the packed object repository 3090 to assist in the identification and validation of minimum viable region candidates 3067. If the object repository 3090 is fully packed, the total surface area of the objects 3091 located thereon will be evenly divisible by the surface area of a single object 3091. In the example shown in FIG. 3G, the total surface area of the objects 3091 divided by the surface area of a single object 3091 is eight. The areas of the potential minimum viable region candidates 3066 may be compared to the total surface area of the objects 3091 to identify minimum viable region candidates 3067 with areas that evenly divide into the total surface area. A threshold factor (e.g., 95%, 98%, 99%, etc.) may be applied to the division operation to account for noise and other sources of measurement error. The identified minimum viable region candidates 3067 may be further validated to determine minimum viable regions 3006 according to further methods described herein.
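The area-based packing check may be sketched as follows; the 98% threshold factor is one of the example values mentioned above, and the helper is an illustration rather than the claimed implementation.

def area_divides_total(candidate_area, total_area, threshold=0.98):
    """Packing verification on areas: check whether the total surface area of
    the packed layer is (nearly) an integer multiple of the candidate's area."""
    if candidate_area <= 0:
        return False
    ratio = total_area / candidate_area
    nearest = round(ratio)
    # Accept if the ratio is within the threshold factor of an integer count.
    return nearest >= 1 and min(ratio / nearest, nearest / ratio) >= threshold

# Example: a fully packed layer of eight identical boxes.
# area_divides_total(candidate_area=0.12, total_area=0.96)  -> True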

Second, the packing verification operation may use the dimensions of the packed object repository 3090 to assist in the identification and validation of minimum viable region candidates 3067. In a specific arrangement wherein the objects 3091 are arranged in equal numbers of rows and columns, the dimensions of the potential minimum viable region candidates 3066 may be compared to the dimensions of the object repository 3090 to identify and/or validate minimum viable region candidates 3067. For example, the widthwise dimension X1 and the lengthwise dimension X2 of the object repository 3090 will be evenly divisible by dimensions of potential minimum viable region candidates 3066 that match the dimensions of the objects 3091. A potential minimum viable region candidate 3066 is shown in FIG. 3G with a darkened border, having the dimensions D1 and D2. If X1 = m*D1, where m is an integer (in the example of FIG. 3G, m = 2), and X2 = n*D2, where n is an integer (in the example of FIG. 3G, n = 4), this may indicate that the potential minimum viable region candidate 3066 represents the true dimensions of an object 3091 on the object repository 3090, and the candidate may be identified as a minimum viable region candidate 3067. If the dimensions of a potential minimum viable region candidate 3066 do not satisfy these conditions, it is unlikely to represent the true dimensions of an object 3091 on the object repository 3090. In embodiments, a percentage threshold (e.g., 95%, 98%, 99%) may be used in the equations for X1 and X2 to account for potential errors in measurement due to noise and other factors. The identified minimum viable region candidates 3067 may be further validated to determine minimum viable regions 3006 according to further methods described herein.
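A sketch of the rows-and-columns dimension check (X1 = m*D1, X2 = n*D2) is shown below, with a percentage tolerance standing in for the noise threshold described above; it is an illustration rather than the claimed implementation.

def matches_row_column_packing(d1, d2, x1, x2, tol=0.98):
    """Check X1 = m*D1 and X2 = n*D2 for integers m, n (rows-and-columns
    packing), allowing a percentage tolerance for measurement noise."""
    def near_integer_multiple(total, part):
        ratio = total / part
        nearest = round(ratio)
        return nearest >= 1 and min(ratio / nearest, nearest / ratio) >= tol
    return near_integer_multiple(x1, d1) and near_integer_multiple(x2, d2)

# Example following FIG. 3G: X1 = 2*D1 and X2 = 4*D2 both hold, so a
# candidate with dimensions (D1, D2) would be retained.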

Third, the packing verification operation may use the dimensions of the object repository 3090 to assist in the identification and validation of minimum viable region candidates 3067 for more general arrangements. In a general arrangement, shown in FIG. 3H, wherein the objects 3091 are arranged in unequal numbers of rows and columns (sometimes referred to as a pinwheel pattern) that completely pack the edges of the object repository 3090, the length of each side of the object repository 3090 should be equal to an integer number of a widthwise dimension of the objects 3091 plus an integer number of a lengthwise dimension of the objects 3091. The general arrangement packing operation may be used whether the object repository 3090 is fully packed or not, as long as the edges are fully packed. The dimensions of the potential minimum viable region candidates 3066 may be compared to the dimensions of the object repository 3090 to identify and/or validate minimum viable region candidates 3067 using a pair of equations that accounts for both the width and length of the objects 3091. A potential minimum viable region candidate 3066 is shown in FIG. 3H with a darkened border, having the dimensions D1 and D2. In the general arrangement, the equations that should be satisfied are as follows: X1 = m1*D1 + n1*D2, where m1 and n1 are both integers (in the example of FIG. 3H, m1 = 1 and n1 = 3), and X2 = m2*D1 + n2*D2, where m2 and n2 are both integers (in the example of FIG. 3H, m2 = 1 and n2 = 3). If both equations can be satisfied by the dimensions of a potential minimum viable region candidate 3066, it may indicate that the potential minimum viable region candidate 3066 represents the true dimensions of an object 3091 on the object repository 3090, and the candidate may be identified as a minimum viable region candidate 3067. If the dimensions of a potential minimum viable region candidate 3066 do not satisfy these conditions, it is unlikely to represent the true dimensions of an object 3091 on the object repository 3090. In embodiments, a percentage threshold (e.g., 95%, 98%, 99%) may be used in the equations for X1 and X2 to account for potential errors in measurement due to noise and other factors. The identified minimum viable region candidates 3067 may be further validated to determine minimum viable regions 3006 according to further methods described herein.
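The general (pinwheel) packing check can be sketched as a small integer search over m and n for each side of the object repository, as below; the tolerance and the bound on the search are assumed values for this illustration.

def matches_pinwheel_packing(d1, d2, x1, x2, tol=0.02, max_count=20):
    """Check X1 = m1*D1 + n1*D2 and X2 = m2*D1 + n2*D2 for non-negative
    integers (general/pinwheel packing with fully packed edges)."""
    def satisfiable(side):
        for m in range(max_count + 1):
            for n in range(max_count + 1):
                if m == 0 and n == 0:
                    continue
                if abs(side - (m * d1 + n * d2)) <= tol * side:
                    return True
        return False
    return satisfiable(x1) and satisfiable(x2)

# Example following FIG. 3H: X1 = 1*D1 + 3*D2 and X2 = 1*D1 + 3*D2.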

In embodiments, a corner classification operation may be carried out by the computing system 1100/3100 on an object repository 3090 having objects 3091 of uniform type. Because the objects 3091 are uniform, it is expected that the objects 3091 identifiable at the open corners may have some features in common. For example, corner types of the objects 3091 may be compared between corners of the object repository 3090. Corner types may include, for example, rectangular, octagonal, and rounded. In some situations, due to measurement error, it may be difficult to distinguish between the different types of object corners at the corners of the object repository 3090. For example, it may be difficult to determine whether an object corner is octagonal or rounded. In such cases, the object corners at each corner of the object repository 3090 may be compared, and the object corner type identified most frequently may be determined as the object corner type of the objects 3091 in the object repository 3090. Similar techniques may be used to determine a textured or textureless classification of the objects 3000 located at the corners of the object repository 3090.

Method 5000, detailed below, describes a method for estimating dimensions of a target object. In embodiments, the method 5000 may make use of a minimum viable region as defined in method 4000. In embodiments, the method 5000 may estimate dimensions of a target object once the robot arm has begun interacting with the object. In particular, it is possible that the minimum viable region defined in method 4000 may be an incorrect size due to noise, false edges, or other inaccuracies. In that case, the robot arm may be instructed to grasp the object at a position that is not secure, increasing the risk of damage to the object or the environment. For example, grasping an object at an off-center position may result in unacceptable levels of torque at the grasping point when lifting is attempted. To more accurately estimate object dimensions, the robot arm may move the grasped object to expose gaps between the object and the adjacent objects, allowing a more accurate approximation of the dimensions of the object to be determined. The defined minimum viable region and the newly identified target object dimensions may then be compared to determine the accuracy of the defined minimum viable region, and, if the difference is significant, the robot arm may release and regrasp the object based on the newly defined dimensions.

The method 5000 for estimating dimensions of an object is illustrated in FIG. 5. The method 5000 is based on moving or repositioning a target object to enable a better view of the physical dimensions of the object. In the method 5000, a minimum viable region for a target open corner of the target object is first determined. The minimum viable region of a target object represents an estimation of the target object dimensions. The minimum viable region may be estimated, for example, by the method 4000, described above. The target object is then grasped based on the minimum viable region. To increase the accuracy of the target object dimension estimation, the method 5000 may include moving or dragging the target object relative to surrounding objects to expose a gap between the objects. The gap may then be used to determine the dimensions of the target object and may be used by the computing system (for example, by adjusting the minimum viable region, e.g., as generated in method 4000). The method 5000 may be stored on a non-transitory computer-readable medium and may be executed by at least one processing circuit, with the at least one processing circuit being in communication with a camera having a field of view. The target object estimated dimensions generated by the method 5000 may be compared against the minimum viable region determined, e.g., by method 4000, so as to ensure that the arm of the robot securely grasps the target object prior to lifting and moving the object.

As discussed above, although the minimum viable region calculation as performed by method 4000 ensures that the minimum viable region exists on only a single object, due to the potential inclusion of false edges or other sources of error, the center of the minimum viable region may not be near the center of the target object. Thus, grasping the object at the center of the minimum viable region may result in an off-center grasp and an object that is difficult to lift. In the event that the adjusted minimum viable region of method 4000 does not sufficiently estimate the target object dimensions, the method 5000 may instruct the arm of the robot to regrasp the target object based on the newly generated target object dimensions.

In embodiments, the method 5000 may be performed with, subsequent to, and/or in conjunction with the method 4000 for identifying minimum viable regions. In such embodiments, some operations of the method 5000 may be the same as the operations of the method 4000. For example, operations 4002 and 5002, operations 4004 and 5004, and operations 4005 and 5005 may coincide and/or be the same operation. The operation 5006 of the method 5000 may incorporate all or some of the operation 4006, including the operations 4008-4016.

The method 5000 includes an operation 5002 of obtaining initial image information representing the physical characteristics of one or more objects 3000 of FIG. 3, for example objects 3000A-3000D. The initial image information is generated by the camera 3200 and describes at least an object appearance associated with the one or more objects, with each object including a plurality of edges. The operation 5002 for obtaining initial image information may include any or all of the methods and techniques discussed above with respect to the operation 4002. The initial image information may be the same image information gathered in method 4000, or it may be newly gathered image information. For the sake of clarity, either source of image information obtained in operation 5002 may be referred to as initial image information. The method 5000 described herein may be executed by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5004 for detecting a plurality of corners 3002. To detect any corners present, the computing system 3100 may use a variety of methods, including any and all of the methods and techniques discussed above with respect to operation 4004. The operation 5004 described herein may be completed by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5005 for identifying a target open corner 3004A from the plurality of corners 3002. The operation 5005 may include any and all of the methods and techniques described above with respect to operation 4005. The operation 5005 described herein may be completed by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5006 for defining a minimum viable region 3006 for the target open corner 3004A. The operation 5006 may include performing one or more portions of the method 4000, in particular operations 4006 through 4016, to determine the minimum viable region 3006 of the target open corner 3004A. The minimum viable region 3006 represents a region on the surface 3001 of the object 3000 associated with the target open corner 3004A that may be grabbed by the robot arm 3320 to move the object 3000.

The minimum viable region 3006 defined in operation 5006 may be the validated minimum viable region 3006 defined by method 4000 (e.g., by one or more of operations 4006 to 4016). In further embodiments, a candidate minimum viable region as determined by operation 4014 may be selected for continued use in method 5000. As noted above, defining the minimum viable region for the target open corner 3004A includes defining an intersection corner 3024, wherein the intersection corner opposes the target open corner 3004A. As used herein, the term "opposes," with respect to the target open corner 3004A and the intersection corner 3024, refers to the position of these corners at opposite corners of a surface (e.g., a top surface) of the target object, wherein the opposing corners do not have common edges extending therefrom. As discussed above, each minimum viable region 3006 is defined by a target open corner 3004A, a portion of the lengthwise physical edge 3013, a portion of the widthwise physical edge 3015, a widthwise candidate edge 3008, a lengthwise candidate edge 3010, and an intersection point 3024 associated with the widthwise candidate edge 3008 and the lengthwise candidate edge 3010. As depicted in FIG. 3E, numerous minimum viable regions 3006A, 3006B, 3006C, etc., may be associated with a single widthwise candidate edge 3008A, 3008B, 3008C or a single lengthwise candidate edge 3010A, 3010B, 3010C. Additionally, each minimum viable region 3006 is associated with only a single intersection point 3024. The operation 5006 described herein may be completed by the at least one processing circuit of the computing system.

Referring now to FIG. 6A, in an operation, the method 5000 includes an operation 5008 for defining a non-occlusion area 3027 based on the minimum viable region 3006. During the execution of method 5000, the robotic arm 3320 may move in the space between the field of view 3210 of the camera 3200 and the object 3000 associated with the open corner 3004A. During this movement, the robot arm 3320 may block or occlude features that are used for detection of the target object within the field of view 3210 of the camera 3200. As discussed below, the method 5000 involves the capture of subsequent image information, e.g., at operation 5016. To accurately capture the subsequent image information describing the dimensions and the minimum viable region 3006 of the target object 3000A, it is desirable that specific portions, or features, of the object 3000A remain non-occluded and positioned so as to be viewable by the camera 3200.

The non-occlusion area 3027 is a two-dimensional region on the surface of the group of objects of which it is desirable to obtain images during the supplemental image information gathering operation, discussed below. Between the non-occlusion area 3027 and the camera 3200 is a non-occlusion zone, a three-dimensional space within which positioning of the robotic arm 3320 should be avoided during an imaging operation so as not to block or occlude the camera 3200 from obtaining supplemental image information of the non-occlusion area 3027. Blocking the non-occlusion area 3027, as referred to herein, refers to blocking the non-occlusion area 3027 from observation by the camera 3200.

The features of the non-occlusion area 3027 may be defined according to the minimum viable region 3006. Thus, one or more edges of the minimum viable region 3006 may serve as the basis to define the non-occlusion area 3027. Features for inclusion in the non-occlusion area 3027 may include the widthwise candidate edge 3008 extending from the intersection point 3024, and the lengthwise candidate edge 3010 extending from the intersection point 3024 substantially perpendicularly to the widthwise candidate edge 3008. Further portions of the non-occlusion area 3027 may include extensions of the physical edges 3013/3015 that define the minimum viable region 3006. The physical edge extensions included in the non-occlusion area 3027 are the portions of the physical edges 3013/3015 of the object stack that extend beyond the boundaries of the minimum viable region 3006, i.e., the portions of the physical edges 3013/3015 that extend beyond the intersections of the physical edges 3013/3015 with the candidate edges 3008/3010.

An example of the non-occlusion area 3027 is shown in FIG. 6A. For illustrative purposes, the non-occlusion area 3027 is shown to include four separate strips 3028A-3028D. Each strip 3028 is a region running parallel to and corresponding to a known physical edge or a candidate edge of the target object. For example, with respect to the target object 3000A, the strips 3028 run parallel to and correspond with the widthwise candidate edge 3008, the lengthwise candidate edge 3010, the lengthwise physical edge 3013, or the widthwise physical edge 3015. As shown in FIG. 6A, strip 3028A corresponds with the lengthwise physical edge 3013, strip 3028B corresponds with the widthwise candidate edge 3008, strip 3028C corresponds with the lengthwise candidate edge 3010, and strip 3028D corresponds with the widthwise physical edge 3015. Each strip 3028 has a length and a width and is located such that at least a portion of the edge (physical or candidate) to which it corresponds is included within the area of the strip 3028.

The length of each strip 3028 may be a fixed distance or may be based on the maximum candidate size. For example, referring to FIG. 6A, strip 3028A is positioned to correspond with the widthwise candidate edge 3008 and extends beyond the widthwise candidate edge 3008 until the strip 3028A encounters the border of the maximum candidate size 3018. The strips 3028 may have various widths, such as 5 mm, 4 mm, or 3 mm, selected according to the application. The larger the width of the strip 3028, the greater the chance that a gap 3026 may be detected when the object 3000A is moved or dragged. Larger widths, however, may come at the cost of an increased possibility of noise or signal errors. The smaller the area of the strips 3028, the more freedom of movement the robotic arm 3320 may have, because there are smaller areas for the robotic arm 3320 to avoid. In addition, smaller non-occlusion areas may reduce computational load. The operation 5008 described herein may be completed by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5010 for transmitting a positioning command for the positioning of the robotic arm 3320 of the robot 3300. The positioning command may be transmitted by the computing system 1100/3100 to cause the positioning of the robotic arm 3320 in a position to grasp the target object 3000A and move the target object 3000A. The positioning command may cause the robotic arm 3320 to be positioned outside of the non-occlusion zone so that the robotic arm 3320 and/or the end effector apparatus 3330 does not block the non-occlusion area 3027 from view by the camera. The positioning command may be transmitted by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5012 for transmitting a minimum viable region (MVR) grasping command for grabbing, picking up, grasping, etc., the target object 3000A from a location within the minimum viable region 3006 of the target object 3000A. For example, the center or approximate center of the minimum viable region 3006 may be selected by the computing system 1100/3100 for grasping the target object 3000A. As discussed above, the end effector apparatus 3330 may employ a suction cup or other grasping tool capable of grasping or securing the object through contact with a surface of the object. In embodiments, the grasping command may be configured to cause the robotic arm 3320 and the end effector apparatus 3330 to remain outside of the non-occlusion zone during the grasping operation.

In an operation, the method 5000 includes an operation 5014 for transmitting a movement command for moving the target object 3000A. The target object 3000A may be moved, as discussed above, to open one or more gaps between the target object 3000A and adjacent objects, thereby permitting the computing system 1100/3100 to more accurately estimate the dimensions of the target object 3000A. The movement command may include three aspects: a movement distance, a movement direction, and a movement type. The movement distance and the movement direction, discussed in greater detail below, may be determined according to information related to adjacent objects. The movement type may include a lifting motion or a dragging motion.

In embodiments, generation of the movement command may include a determination of a movement type for the movement command. Determination of the movement type may include determining whether to cause the robotic arm 3320 to use a lifting motion or a dragging motion. To decide which movement type to use, the computing system 1100/3100 may compare the minimum viable region 3006 (e.g., as obtained in operation 5006) against the maximum candidate size 3018. If the minimum viable region 3006 is small when compared against the maximum candidate size 3018, there is an increased risk that the minimum viable region 3006 determined in operation 5006 incorrectly estimates the dimensions of the target object 3000A, by representing only a small corner. If the robotic arm 3320 were to lift the target object 3000A from a minimum viable region 3006 that represents only a small corner of the target object 3000A, the target object 3000A or the environment may be damaged due to increased torque placed on the end effector apparatus 3330. In the alternative, if the minimum viable region 3006 is comparable to the maximum candidate size 3018, there is a higher confidence that the minimum viable region 3006 accurately portrays the target object 3000A. Accordingly, the region-candidate ratio between the minimum viable region 3006 and the maximum candidate size 3018 may be used to determine whether to select a lifting or a dragging motion for the movement command. If the region-candidate ratio is greater than or equal to a certain threshold, the computing system may select the lifting motion. Example threshold values include 50% or greater, 60% or greater, 70% or greater, 80% or greater, and 90% or greater. The computing system 1100/3100 may select a lifting motion when the region-candidate ratio surpasses the threshold because comparable relative sizing between the minimum viable region 3006 and the maximum candidate size 3018 increases the confidence that the robotic arm 3320 will be able to securely grasp the target object 3000A without a high risk of damage to the object 3000A or the environment due to increased torque. Selecting the lifting motion as the movement type in such a situation may further allow the computing system 3100 to measure a size of the target object 3000A, as described below, and the weight of the target object 3000A.

If the region-candidate ratio is smaller than the threshold value, the dragging motion may be selected as the movement type for the movement command. The computing system 1100/3100 may select the dragging motion because, if the threshold is not surpassed, the certainty that the minimum viable region accurately represents the dimensions of the target object 3000A is lower, increasing the risk of damage to either the target object 3000A or the environment. In embodiments, the movement command may be configured to cause the robotic arm 3320 to provide a small lifting force while performing the dragging motion to reduce friction. Selecting the dragging motion may allow the computing system 1100/3100 to measure the size of the target object 3000A as described below, but not the weight of the target object 3000A.
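The lift-versus-drag decision based on the region-candidate ratio may be sketched as follows, using area as the size measure and the 70% example threshold; this is an illustration, not the claimed implementation.

def choose_movement_type(mvr_area, max_candidate_area, ratio_threshold=0.7):
    """Select the movement type from the region-candidate ratio between the
    minimum viable region and the maximum candidate size."""
    ratio = mvr_area / max_candidate_area
    return "lift" if ratio >= ratio_threshold else "drag"

# choose_movement_type(0.09, 0.10) -> "lift"
# choose_movement_type(0.03, 0.10) -> "drag"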

The movement distance and the movement direction of the movement command may be determined to permit the target object 3000A to be moved relative to the surrounding or adjacent objects 3000B-3000D to expose gaps 3026 between the edges of the objects 3000. In embodiments, the movement distance and the movement direction of the target object 3000A are based on a movement distance and direction of the target open corner 3004A. The movement distance may be determined according to an amount of movement necessary to expose a gap. The movement direction may be determined based on a likelihood of exposing gaps and a likelihood of avoiding collision with objects 3000 adjacent to the target object 3000A.

The movement distance may be determined according to the amount of movement necessary to expose a gap. To expose a gap 3026, the computing system 1100/3100 may select a movement distance sufficient to expose a gap 3026 that surpasses a gap width threshold (see below for further details). In embodiments, the movement distance may be selected according to the size of the objects (e.g., larger objects may require larger movement distances to create identifiable gaps 3026). In some embodiments, in which gaps 3026 are observed during execution of a movement command, there may be no preset movement distance. The movement distance may instead be determined dynamically, based on a determination that a sufficient gap size (exceeding the gap width threshold) has been detected.
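
A dynamically determined movement distance of this kind may be sketched as an incremental move loop that stops once a sufficiently wide gap is observed. The sketch below is hypothetical; detect_max_gap_width and step_move stand in for the image-based gap measurement and the incremental robot motion, and the numeric values are assumed examples only.

```python
# Illustrative sketch only: move in small increments until an observed gap exceeds
# the gap width threshold, rather than committing to a preset movement distance.
GAP_WIDTH_THRESHOLD = 0.02  # meters; ~2 cm example value from the description
STEP = 0.005                # 5 mm increments, assumed for illustration

def move_until_gap(step_move, detect_max_gap_width, max_distance=0.10):
    """step_move(d): command a displacement of d meters along the movement direction.
    detect_max_gap_width(): return the widest currently visible gap, in meters."""
    moved = 0.0
    while moved < max_distance:
        step_move(STEP)
        moved += STEP
        if detect_max_gap_width() >= GAP_WIDTH_THRESHOLD:
            break  # sufficient gap exposed; stop moving
    return moved
```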

The movement direction may be selected to move the target object 3000A away from the adjacent objects 3000 to expose the gaps 3026 while avoiding potential collisions. In embodiments, the movement direction may be selected as a combination of vectors, each of which represents a direction away from an adjacent object. Accordingly, the object 3000A may be moved in a diagonal direction 3029. The diagonal direction 3029 represents a combination of a horizontal vector in a direction opposite to the widthwise physical edge vector 3014 and a vertical vector in a direction opposite to the lengthwise physical edge vector 3012. The gaps 3026, once exposed, may be used by the computing system 1100/3100 to estimate dimensions of the target object 3000A, as explained in greater detail below. The operation 5014 described herein may be completed by the at least one processing circuit of the computing system.
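
The diagonal direction described above can be viewed as the normalized negative sum of the two physical edge vectors. The following hypothetical sketch assumes 2-D edge vectors expressed in the plane of the object surface; the function name movement_direction is illustrative and not defined by the disclosure.

```python
import math

# Illustrative sketch only: combine vectors pointing away from the adjacent objects,
# i.e., opposite to the lengthwise (3012) and widthwise (3014) physical edge vectors.
def movement_direction(lengthwise_edge_vec, widthwise_edge_vec):
    dx = -(lengthwise_edge_vec[0] + widthwise_edge_vec[0])
    dy = -(lengthwise_edge_vec[1] + widthwise_edge_vec[1])
    norm = math.hypot(dx, dy) or 1.0   # guard against a zero-length result
    return (dx / norm, dy / norm)      # unit vector along the diagonal direction 3029
```

For edge vectors (1, 0) and (0, 1), the sketch returns approximately (-0.707, -0.707), i.e., a diagonal direction pointing away from both neighboring objects.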

In embodiments, the computing system 1100/3100 may adjust the movement command during execution of the movement command. Adjustment of the movement command may take into account whether the dragging of the object 3000A is obstructed by an obstacle, for example a heavy adjacent object. To avoid damage to either the robotic arm 3320 or the object 3000A, the system may include, for example, force sensors that detect when a dragging or resistance force exceeds a defined threshold. The defined threshold may be selected according to a safety factor related to the robot 3300, the robotic arm 3320, and/or the end effector apparatus 3330. The defined threshold may also be selected according to a safety factor related to the object 3000A and/or a type of object to be moved. If the defined threshold is exceeded, the robotic arm 3320 may be commanded to attempt to perform the dragging in a different direction and over a different distance from the original movement direction and movement distance, and/or may be commanded to alter a location at which the object 3000A is grasped. In embodiments where a high amount of drag or resistance force has been detected, the robotic arm 3320 may remain grasping the object 3000A until the forces subside, so as to reduce the risk of the object 3000A inadvertently and unexpectedly moving as a result of the forces. For example, in a situation where the target object 3000A is stacked on a plurality of objects, excessive drag or resistance force may cause instability among the objects 3000 positioned beneath the target object 3000A.
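
The force-based adjustment may be sketched as a per-step check against a defined threshold. In the hypothetical sketch below, read_drag_force, execute_step, and replan are placeholders for the force-sensor reading, the incremental drag motion, and the re-planning of direction, distance, or grasp location; the 40 N threshold is an assumed example value, not one given by the disclosure.

```python
# Illustrative sketch only: abort the current drag and re-plan when the measured
# resistance force exceeds a threshold chosen from the applicable safety factors.
FORCE_THRESHOLD_N = 40.0  # assumed example value

def drag_with_force_check(steps, read_drag_force, execute_step, replan):
    for step in steps:
        if read_drag_force() > FORCE_THRESHOLD_N:
            # Keep the grasp until forces subside, then retry with a different
            # movement direction/distance or a different grasp location.
            replan()
            return False
        execute_step(step)  # force is acceptable; perform this portion of the drag
    return True
```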

The processing circuit may be configured to control the robotic arm 3320 of the robot 3300 so as to avoid blocking or occluding the non-occlusion area 3027 during the operations 5010, 5012, and 5014 (discussed above), so that the camera 3200 may maintain viewability of the non-occlusion area 3027. The computing system 3100 may generate the positioning, grasping, and movement commands for the end effector apparatus 3330 and/or the robotic arm 3320 of the robot 3300 so as to avoid blocking the non-occlusion area 3027, e.g., the strips 3028A-3028D, from the view of the camera 3200. In embodiments, the positioning, grasping, and movement commands may be configured to permit blocking of the non-occlusion area during the positioning and grasping commands and during at least a portion of the movement commands. For example, the end effector apparatus 3330 of the robotic arm 3320 may be positioned for grasping the target object 3000A, and the target object 3000A may be grasped, while at least a portion of the robotic arm 3320 is blocking the non-occlusion area. After grasping the target object 3000A, the robotic arm 3320 may be further positioned so as to avoid blocking the non-occlusion area 3027. In further embodiments, the movement command may begin while the robotic arm 3320 is blocking the non-occlusion area 3027, and execution of the movement command causes further movement of the robotic arm 3320 to a position that avoids blocking the non-occlusion area 3027. In further embodiments, positions of robotic arm 3320 portions may be adjusted during execution of the movement command to ensure that they are not blocking the non-occlusion area when the supplemental image information is captured at operation 5016.

In embodiments, there may exist situations where it is impossible or not feasible to execute the positioning, grasping, and movement commands without the robotic arm 3320 of the robot 3300 partially or completely blocking the non-occlusion area 3027 from the camera field of view 3210. In such a case, the computing system 1100/3100 generates movement commands or instructions for the robotic arm 3320 of the robot 3300 that allow the robotic arm 3320 to partially block the non-occlusion area 3027, because the computing system 1100/3100 may accurately estimate the dimensions of the target object 3000A as long as certain aspects of the non-occlusion area 3027 remain unobstructed. In particular, keeping the intersection point 3024, at least a portion of the widthwise candidate edge 3008, and at least a portion of the lengthwise candidate edge 3010 within the field of view 3210 of the camera 3200 may permit the computing system 1100/3100 to estimate the target object 3000A dimensions. In a partially occluded situation, as discussed in greater detail below, the computing system 1100/3100 may be configured to infer or project the estimated object dimensions of the target object 3000A from the intersection point 3024, the portion of the widthwise candidate edge 3008, and the portion of the lengthwise candidate edge 3010 that are within the field of view 3210 of the camera 3200. Accordingly, the computing system 1100/3100 may be configured to cause execution of the positioning, grasping, and movement commands to permit partial occlusion that nonetheless leaves the intersection point 3024, at least a portion of the widthwise candidate edge 3008, and at least a portion of the lengthwise candidate edge 3010 unblocked from viewing by the camera 3200.
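
The partial-occlusion constraint described above may be checked geometrically before a command is issued. In the hypothetical sketch below, occludes is an assumed predicate that tests whether a projected arm footprint covers a given image point; it and the other parameter names are illustrations only, not elements of the disclosure.

```python
# Illustrative sketch only: accept an arm pose if the intersection point and at least
# a portion of each candidate edge remain visible to the camera.
def pose_keeps_required_features_visible(arm_footprint, intersection_point,
                                          widthwise_edge_points, lengthwise_edge_points,
                                          occludes):
    if occludes(arm_footprint, intersection_point):
        return False
    width_visible = any(not occludes(arm_footprint, p) for p in widthwise_edge_points)
    length_visible = any(not occludes(arm_footprint, p) for p in lengthwise_edge_points)
    return width_visible and length_visible
```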

In an operation, the method 5000 includes an operation 5016 for obtaining subsequent image information of the one or more objects 3000. The subsequent image information is obtained to identify the movement (occurring in operation 5014) of the one or more objects 3000, specifically the target object 3000A. The subsequent image information includes information representing the altered positions of the one or more objects after the operation 5014. Similarly to operation 5002, the subsequent image information is gathered or captured by the camera 3200, as shown in FIG. 3A. The operation 5016, in capturing the supplemental image information, may include any or all of the methods and techniques discussed above with respect to operations 4002 and 5002. Further, as discussed above, the previously executed robotic movement commands (positioning, grasping, movement) may be executed so as to leave the non-occlusion area unblocked or at least partially unblocked.

Further, capturing the supplemental image information may be performed after or during execution of the above-described robotic movement commands. For example, the supplemental image information may be captured after execution of the motion command. In another example, the supplemental image information may be captured during execution of the motion command. Supplemental image information captured after completion of the motion command may be referred to as supplemental still image information, while supplemental image information captured during execution of the motion command may be referred to as supplemental motion image information. The operation 5016 described herein may be completed by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5018 for estimating dimensions of the target object 3000A based on the supplemental image information. FIG. 6B depicts subsequent image information after the target object 3000A is dragged in accordance with operation 5014. The operation 5018 for estimating dimensions of the target object 3000A may use supplemental still image information, supplemental motion image information, or both.

Estimating dimensions of the target object 3000A according to the supplemental still image information includes detecting the presence of gaps 3026 between the target object 3000A and any adjacent objects 3000 depicted within the subsequent image information after completion of the robotic movements associated with the movement command. The subsequent image information is analyzed at the non-occlusion area 3027, or, as shown in FIG. 6A and FIG. 6B, the strips 3028A-3028D, for the presence of gaps 3026. Gaps 3026 may be detected from the supplemental image information according to any of the image analysis techniques (e.g., edge detection, point cloud analysis, etc.) described herein. A gap 3026 indicates that a boundary or edge of the target object 3000A has separated from the boundary or edge of a neighboring object. In FIG. 6B, the target object 3000A is shown as being separated from objects 3000B and 3000D, creating the gaps 3026A and 3026B, respectively.

Each gap 3026A, 3026B, after identification, may be additionally analyzed to measure its width and determine whether the gap width exceeds a gap width threshold. The gap width threshold may be set between 1 cm and 3 cm, between 1.5 cm and 2.5 cm, or at approximately 2 cm. Other gap width thresholds may be used as appropriate. In embodiments, the gap width threshold may be selected according to the size of objects being analyzed (larger objects having a larger threshold) or according to capabilities of the cameras and other sensors involved in the system (more accurate cameras and sensors permitting smaller thresholds). For example, a gap 3026 with a width exceeding the gap width threshold may be determined to correspond to an edge of an object 3000, while a gap 3026 that does not exceed the gap width threshold may be dismissed as caused by imaging noise, vibrations, or other inaccuracies. In this example, the strips 3028A-3028D of the non-occlusion area 3027 may be analyzed by the computing system 3100 so as to indicate or detect whether a gap 3026 exceeding the gap width threshold is present. Referring to FIG. 6B, strips 3028A and 3028C encompass gap 3026A, while strips 3028B and 3028D encompass gap 3026B. Once gaps 3026 exceeding the gap width threshold are detected, they may be used to define the physical edges of both the target object 3000A and the neighboring objects 3000. The physical edges identified based on the gaps 3026 may be additionally used to differentiate or identify the objects 3000 from one another as well as to estimate the dimensions of the target object 3000A.
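
Gap detection within a strip may be sketched as a search for a run of samples with no detected surface that is wider than the gap width threshold. The sketch below is hypothetical; strip_samples, the 2 mm sample spacing, and the 2 cm threshold are assumptions made for illustration.

```python
# Illustrative sketch only: report whether a strip of the supplemental image data
# contains a gap wider than the gap width threshold.
GAP_WIDTH_THRESHOLD = 0.02  # meters; ~2 cm example from the description

def strip_has_gap(strip_samples, sample_spacing=0.002):
    """strip_samples: booleans along the strip, True where no surface is detected."""
    run_length = 0
    for is_empty in strip_samples:
        run_length = run_length + 1 if is_empty else 0
        if run_length * sample_spacing >= GAP_WIDTH_THRESHOLD:
            return True   # wide enough to be treated as a physical edge
    return False          # narrower gaps dismissed as noise or vibration
```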

Identification of the gaps 3026 permits the computing system 1100/3100 to determine the true physical edges of the target object 3000A. Based on the true physical edges of the target object 3000A, the computing system 1100/3100 may estimate the dimensions of the target object 3000A.

In embodiments, the operation 5018 for estimating dimensions of the target object 3000A may operate to detect and analyze gaps 3026 and the movement of objects 3000 during the robotic motion caused by execution of the movement command. The operation 5018 described herein may be completed by the at least one processing circuit of the computing system. Such analysis may be based on supplemental motion image information captured during operation 5016. The gap detection according to the supplemental motion image information may be used to supplement or replace the gap detection according to the supplemental still image information, as described above. The computing system 1100/3100 may perform such gap and object motion detection based on identifying clusters of points positioned within the strips 3028A-3028D. The computing system 1100/3100 may search along the strips 3028A-3028D during the robotic motion caused by the movement command to detect whether points, for example points in a point cloud, are moving in conjunction with either the dragging or lifting motion of the object 3000A.

The points that are tracked may coincide with locations on a surface 3001 of any of the objects 3000. The computing system 3100 may group identified points that are moving in the same direction and by the same amount into clusters. The clusters may be used to identify objects and differentiate between adjacent objects based on the movement of the clusters. The movement of each cluster may be compared to the movement of each of the other clusters. If two or more clusters display comparable movement, each of the clusters may be associated with the same object. For example, a first cluster of points may be associated with the physical edge 3013 of the target object 3000A and a second cluster of points may be associated with the widthwise physical edge 3015 of the target object 3000A. In such an example, the movement of both the first cluster and the second cluster will be comparable and, therefore, the computing system 3100 may associate the clusters with the same target object 3000A.
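
The cluster grouping described above can be sketched as a comparison of point displacements between corresponding point clouds captured before and during the commanded motion. The sketch below is hypothetical and uses NumPy; the 5 mm tolerance and the single moving-cluster simplification are assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only: separate tracked points into a cluster that moves with the
# commanded motion and points that remain effectively static.
def cluster_by_motion(points_before, points_after, tol=0.005):
    """points_before, points_after: (N, 3) arrays of corresponding point positions."""
    displacements = points_after - points_before
    moving = np.linalg.norm(displacements, axis=1) > tol
    if not np.any(moving):
        return {"moving": points_after[:0], "static": points_after}
    mean_disp = displacements[moving].mean(axis=0)
    # Points whose displacement matches the mean moving displacement are grouped as
    # belonging to the same (moving) object; the remainder are treated as static.
    same_object = moving & (np.linalg.norm(displacements - mean_disp, axis=1) < tol)
    return {"moving": points_after[same_object], "static": points_after[~same_object]}
```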

The information gathered during the above-described cluster tracking may be used in two ways. First, the cluster tracking may be used in a standalone fashion to identify gaps 3026 between objects and identify the true physical edges of objects. For example, four perpendicular edges showing similar motion may be understood as the true physical edges of an object. Second, the cluster tracking may be used to supplement information obtained by the gap detection methods based on the supplemental still image information. For example, a portion of an object may have visual characteristics similar to those of a gap. Accordingly, that portion of the object may be falsely identified as a gap according to the supplemental still image information. However, if that portion of the object coincides with one of the strips 3028A-3028D that is imaged and monitored during execution of the movement command, it may be determined that all point clusters associated with the false gap move in a similar fashion. Thus, the false gap may be identified as false because, were the gap real, some of the clusters associated with the gap would show no movement, because those clusters would be points on the surfaces of the adjacent, unmoving objects.
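
The false-gap test may be sketched as a comparison of the displacements of clusters on either side of a candidate gap. The sketch below is hypothetical; the displacement vectors and the tolerance value are assumptions made for illustration only.

```python
import numpy as np

# Illustrative sketch only: a candidate gap is rejected as false when the clusters on
# both of its sides move together during the commanded motion; a real gap would leave
# the clusters on the neighboring, unmoving object essentially stationary.
def gap_is_false(side_a_displacement, side_b_displacement, tol=0.005):
    a = np.asarray(side_a_displacement, dtype=float)
    b = np.asarray(side_b_displacement, dtype=float)
    both_moving = np.linalg.norm(a) > tol and np.linalg.norm(b) > tol
    similar_motion = np.linalg.norm(a - b) < tol
    return both_moving and similar_motion
```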

In an operation, the method 5000 may include an operation 5020 for transmitting a release command for causing the end effector apparatus 3330 of the robot 3300 to release the target object 3000A. The operation 5020 is an optional operation that may be executed if the estimated dimensions of the target object 3000A exceed a tolerance threshold compared to the minimum viable region 3006. In operation 5012, as discussed above, the robotic arm 3320 is commanded to grab the target object 3000A based on the minimum viable region 3006 defined at operation 5006. As noted above, the minimum viable region 3006 defined at operation 5006 may be chosen based on a potential candidate minimum viable region having a smallest area. As stated above regarding operation 5014, a case exists where the minimum viable region 3006 underestimates the dimensions of the target object 3000A. In such a case, the robotic arm 3320 of the robot 3300 may grip the object 3000 in an unstable way, for example, too close to an edge or off-center, which may cause damage to the target object 3000A, the environment, or the robot 3300 itself.

For example, the target object 3000A may have two flaps that are coupled together using a piece of tape. During the operation 5006, the computing system 1100/3100 may improperly consider the gap between the two flaps of the single target object 3000A as an edge of the target object 3000A, resulting in a minimum viable region 3006 that may not be securely grasped by the robotic arm 3320 of the robot 3300. In such a case, the off-center gripping may cause the robotic arm 3320 of the robot 3300 to inadvertently rip open the flap. Based on the size discrepancy, as discussed above with respect to operation 5014, the computing system 1100/3100 would have instructed the robotic arm 3320 of the robot 3300 to drag and not lift the object, reducing the risk of damage.

After more accurately estimating the dimensions of the target object 3000A, the computing system 1100/3100 may determine whether or not to release and regrasp the object. The minimum viable region 3006 may be compared to the estimated dimensions of the target object 3000A. If the difference between the minimum viable region 3006 and the estimated dimensions of the target object 3000A exceeds a tolerance threshold, the robotic arm 3320 of the robot 3300 can be commanded to release the target object 3000A. The tolerance threshold may be chosen based on a ratio of the minimum viable region 3006 to the estimated dimensions of the target object 3000A and/or the weight of the target object 3000A. In particular, the tolerance threshold may be set so as to be exceeded when the estimated dimensions of the target object 3000A are significantly larger than the minimum viable region 3006. Some examples of the tolerance thresholds include: the estimated dimensions of the target object 3000A being more than 1.25×, 1.5×, 2×, 2.5×, or 3× the size of the minimum viable region 3006. If the comparison yields a result outside of the tolerance threshold and the release command is transmitted, the operation 5022 for regrasping the target object 3000A, as described below, is performed. If the comparison yields a result within the tolerance threshold, the release command is not transmitted, the operation 5022 is skipped, and operation 5024, as described below, is performed. The operation 5020 described herein may be completed by the at least one processing circuit of the computing system.
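
The release decision may be sketched as a ratio test against one of the example tolerance factors listed above. The sketch below is hypothetical; the 1.5× factor is only one of the example values and should not be read as a fixed parameter of the disclosure.

```python
# Illustrative sketch only: release and regrasp when the estimated dimensions are
# significantly larger than the minimum viable region.
def should_release(mvr_area: float, estimated_area: float, factor: float = 1.5) -> bool:
    return estimated_area > factor * mvr_area
```

For instance, a minimum viable region of 0.04 m² compared against estimated dimensions covering 0.09 m² yields a ratio of 2.25, which would trigger the release and regrasp under the assumed 1.5× factor.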

In an operation, the method 5000 includes an operation 5022 for transmitting a regrasping command for the robotic arm 3320 of the robot 3300 to re-grab the target object 3000A within the estimated dimensions of the target object 3000A. The regrasping command may be executed after the robotic arm 3320 of the robot 3300 has released the target object 3000A. The operation 5022 may occur when operation 5020 determines that the tolerance threshold is exceeded and, subsequently, the robotic arm 3320 of the robot 3300 has released the object 3000A. Once the target object 3000A is released, the robotic arm 3320 of the robot 3300 is repositioned and instructed to grab the target object 3000A again based on the estimated dimensions of the target object 3000A. For example, the robotic arm 3320 may be instructed to grab the target object 3000A at or near an estimated center of the target object 3000A. The additional step of re-grabbing the target object 3000A ensures a more stable grip prior to lifting the target object 3000A. The operation 5022 described herein may be completed by the at least one processing circuit of the computing system.

In an operation, the method 5000 includes an operation 5024 for transmitting a transfer command to the robotic arm 3320. The transfer command may be transmitted in two situations. First, if it is determined at operation 5020 that a release command is necessary (e.g., the estimated dimensions of the target object exceed a threshold relative to the minimum viable region), the transfer command is transmitted after the robotic arm 3320 is determined to be securely grasping the target object 3000A as a result of the operation 5022. Second, if it is determined at operation 5020 that a release command is not necessary (e.g., the estimated dimensions of the target object do not exceed a threshold relative to the minimum viable region), the transfer command may be transmitted without transmission of the release command. The computing system 1100/3100 is configured for transmitting the transfer command for the robotic arm 3320 to lift the target object 3000A and transfer the target object to a destination. By lifting the object 3000A, additional information regarding the weight and dimensions of the target object 3000A may be ascertained. The weight of the target object 3000A may be used to alter the instructions provided by the computing system 1100/3100 to the robot 3300 in further operations, and the weight may be used to assist in classifying and identifying the target object 3000A. The operation 5024 described herein may be completed by the at least one processing circuit of the computing system.

In embodiments, the system may perform a de-palletization operation based on the methods 4000 and 5000. The de-palletization operation may involve iteratively executing the methods 4000 and 5000 to identify and estimate the dimensions of objects on a pallet so as to safely and securely perform the de-palletization operation. Each object is identified and assessed according to the MVR and dimension estimation techniques described herein so that transport of the object can be achieved without damage to the object, the environment, and/or the robot arm. The methods described herein may be particularly useful in situations where it is necessary to identify the size, shape, and weight of pallet objects as they are de-palletized.
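
The iterative de-palletization flow may be sketched as a loop over the detection, measurement, and transfer steps described above. In the hypothetical sketch below, detect_remaining_objects, measure_and_grasp, and transfer_to_destination are placeholders for the operations of methods 4000 and 5000 and are not defined by the disclosure.

```python
# Illustrative sketch only: repeat detection, MVR-based grasping and measurement, and
# transfer until no objects remain on the pallet.
def depalletize(detect_remaining_objects, measure_and_grasp, transfer_to_destination):
    while True:
        remaining = detect_remaining_objects()   # initial image information / corners
        if not remaining:
            break
        target = remaining[0]                    # e.g., object at the next target open corner
        measure_and_grasp(target)                # MVR grasp, move, estimate, regrasp if needed
        transfer_to_destination(target)          # lift and transfer (operation 5024)
```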

It will be apparent to one of ordinary skill in the relevant arts that other suitable modifications and adaptations to the methods and applications described herein can be made without departing from the scope of any of the embodiments. The embodiments described above are illustrative examples, and it should not be construed that the present invention is limited to these particular embodiments. It should be understood that various embodiments disclosed herein may be combined in different combinations than the combinations specifically presented in the description and accompanying drawings. It should also be understood that, depending on the example, certain acts or events of any of the processes or methods described herein may be performed in a different sequence, may be added, merged, or left out altogether (e.g., all described acts or events may not be necessary to carry out the methods or processes). In addition, while certain features of embodiments hereof are described as being performed by a single component, module, or unit for purposes of clarity, it should be understood that the features and functions described herein may be performed by any combination of components, units, or modules. Thus, various changes and modifications may be effected by one skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.

Embodiment 1 is a computing system comprising a non-transitory computer-readable medium; at least one processing circuit in communication with a camera having a field of view and configured, when one or more objects are or have been in the field of view, to execute instructions stored on the non-transitory computer-readable medium for: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command are configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

Embodiment 2 is the computing system of embodiment 1, wherein the at least one processing circuit is further configured for transmitting a release command for causing the end-effector of the robot to release the target object if the area defined by the estimated dimensions exceeds a tolerance threshold compared to the minimum viable region.

Embodiment 3 is the computing system of embodiment 2, wherein the at least one processing circuit is further configured for transmitting a regrasping command for the end-effector of the robot to grab the target object within the estimated dimensions.

Embodiment 4 is the computing system of any of embodiments 1-3, wherein the at least one processing circuit is further configured for transmitting a transfer command for the arm of the robot to transfer the target object if the estimated dimensions are within a tolerance threshold compared to the minimum viable region.

Embodiment 5 is the computing system of any of embodiments 1-4, wherein defining the minimum viable region for the target open corner includes defining an intersection corner opposing the target open corner.

Embodiment 6 is the computing system of any of embodiments 1-5, wherein defining the minimum viable region further includes identifying physical edges of the target object.

Embodiment 7 is the computing system of any of embodiments 1-6, wherein defining the minimum viable region for the target open corner includes defining a first candidate edge extending from the intersection corner in a first direction and defining a second candidate edge extending from the intersection corner in a second direction, substantially perpendicular to the first direction.

Embodiment 8 is the computing system of any of embodiments 1-7, wherein the non-occlusion area includes the intersection corner and at least a portion of the first candidate edge and the second candidate edge.

Embodiment 9 is the computing system of any of embodiments 1-8, wherein the at least one processing circuit is further configured for transmitting the positioning command, the grasping command, and the movement command such that the arm of the robot does not block the non-occlusion area while the supplemental image information is obtained.

Embodiment 10 is the computing system of any of embodiments 1-9, wherein the at least one processing circuit is further configured to detect at least one gap between the target object and objects adjacent to the target object based on the supplemental image information.

Embodiment 11 is the computing system of any of embodiments 1-10, further comprising: identifying first physical edges of the target object from the initial image information; identifying second physical edges of the target object based on the at least one gap, wherein calculating the estimated dimensions of the target object is performed according to the first physical edges and the second physical edges.

Embodiment 12 is the computing system of any of embodiments 1-11, wherein the movement distance and the movement direction are determined based on avoiding collision with the objects adjacent to the target object.

Embodiment 13 is the computing system of any of embodiments 1-12, wherein obtaining the supplemental image information is performed during movement of the robotic arm caused by execution of the movement command.

Embodiment 14 is the computing system of any of embodiments 1-13, wherein the at least one processing circuit is further configured for identifying physical edges of the target object according to comparable movement during movement of the robotic arm caused by execution of the movement command.

Embodiment 15 is the computing system of any of embodiments 1-14, wherein the at least one processing circuit is further configured to determine the movement type as lifting movement or dragging movement based on a comparison between the MVR and a maximum candidate size of the target object.

Embodiment 16 is a method of controlling a robotic system comprising a non-transitory computer-readable medium, at least one processing circuit in communication with a camera having a field of view and configured to execute instructions, the method comprising: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command are configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

Embodiment 17 is the method of embodiment 16, wherein defining the minimum viable region for the target open corner includes: defining an intersection corner opposing the target open corner; identifying physical edges of the target object; defining a first candidate edge extending from the intersection corner in a first direction; and defining a second candidate edge extending from the intersection corner in a second direction, substantially perpendicular to the first direction.

Embodiment 18 is the method of embodiment 16, further comprising: detecting at least one gap between the target object and objects adjacent to the target object based on the supplemental image information; identifying first physical edges of the target object from the initial image information; and identifying second physical edges of the target object based on the at least one gap, wherein calculating the estimated dimensions of the target object is performed according to the first physical edges and the second physical edges.

Embodiment 19 is the method of embodiment 16, wherein obtaining the supplemental image information is performed during movement of the robotic arm caused by execution of the movement command, the method further comprising identifying physical edges of the target object according to comparable movement during movement of the robotic arm caused by execution of the movement command.

Embodiment 20 is a non-transitory computer-readable medium including instructions for execution by at least one processing circuit in communication with a camera having a field of view and configured, when one or more objects are or have been in the field of view, the instructions being configured for: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command are configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

CLAIMS

1. A computing system comprising: at least one processing circuit in communication with a robot, having an arm and an end-effector connected thereto, and a camera having a field of view and configured, when one or more objects are or have been in the field of view, to execute instructions stored on a non-transitory computer-readable medium for: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning the arm of the robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command are configured to prevent the arm of the robot from blocking the non-occlusion area of the one or more objects.

2. The computing system of claim 1, wherein the at least one processing circuit is further configured for transmitting a release command for causing the end-effector of the robot to release the target object if the area defined by the estimated dimensions exceeds a tolerance threshold compared to the minimum viable region.

3. The computing system of claim 2, wherein the at least one processing circuit is further configured for transmitting a regrasping command for the end-effector of the robot to grab the target object within the estimated dimensions.

4. The computing system of claim 1, wherein the at least one processing circuit is further configured for transmitting a transfer command for the arm of the robot to transfer the target object if the estimated dimensions are within a tolerance threshold compared to the minimum viable region.

5. The computing system of claim 1, wherein defining the minimum viable region for the target open corner includes defining an intersection corner opposing the target open corner.

6. The computing system of claim 5, wherein defining the minimum viable region further includes identifying physical edges of the target object.

7. The computing system of claim 6, wherein defining the minimum viable region for the target open corner includes defining a first candidate edge extending from the intersection corner in a first direction and defining a second candidate edge extending from the intersection corner in a second direction, substantially perpendicular to the first direction.

8. The computing system of claim 7, wherein the non-occlusion area includes the intersection corner and at least a portion of the first candidate edge and the second candidate edge.

9. The computing system of claim 1, wherein the at least one processing circuit is further configured for transmitting the positioning command, the grasping command, and the movement command such that the arm of the robot does not block the non-occlusion area while the supplemental image information is obtained.

10. The computing system of claim 1, wherein the at least one processing circuit is further configured to detect at least one gap between the target object and objects adjacent to the target object based on the supplemental image information.

11. The computing system of claim 10, further comprising: identifying first physical edges of the target object from the initial image information; identifying second physical edges of the target object based on the at least one gap, wherein calculating the estimated dimensions of the target object is performed according to the first physical edges and the second physical edges.

12. The computing system of claim 1, wherein the movement distance and the movement direction are determined based on avoiding collision with the objects adjacent to the target object.

13. The computing system of claim 1, wherein obtaining the supplemental image information is performed during movement of the arm of the robot caused by execution of the movement command.

14. The computing system of claim 13, wherein the at least one processing circuit is further configured for identifying physical edges of the target object according to comparable movement during movement of the arm of the robot caused by execution of the movement command.

15. The computing system of claim 1, wherein the at least one processing circuit is further configured to determine the movement type as lifting movement or dragging movement based on a comparison between the MVR and a maximum candidate size of the target object.

16. A method of controlling a robotic system comprising a non-transitory computer-readable medium, at least one processing circuit in communication with a camera having a field of view and configured to execute instructions, the method comprising: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning an arm of a robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command are configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.

17. The method of claim 16, wherein defining the minimum viable region for the target open corner includes: defining an intersection corner opposing the target open corner; identifying physical edges of the target object; defining a first candidate edge extending from the intersection corner in a first direction; and defining a second candidate edge extending from the intersection corner in a second direction, substantially perpendicular to the first direction.

18. The method of claim 16, further comprising: detecting at least one gap between the target object and objects adjacent to the target object based on the supplemental image information; identifying first physical edges of the target object from the initial image information; and identifying second physical edges of the target object based on the at least one gap, wherein calculating the estimated dimensions of the target object is performed according to the first physical edges and the second physical edges.

19. The method of claim 16, wherein obtaining the supplemental image information is performed during movement of the robotic arm caused by execution of the movement command, the method further comprising identifying physical edges of the target object according to comparable movement during movement of the arm of the robot caused by execution of the movement command.

20. A non-transitory computer-readable medium including instructions for execution by at least one processing circuit in communication with a camera having a field of view and configured, when one or more objects are or have been in the field of view, the instructions being configured for: obtaining initial image information of one or more objects, wherein the initial image information is generated by the camera; detecting a plurality of corners of the one or more objects based on the initial image information; identifying a target open corner of a target object from the plurality of corners; defining a minimum viable region (MVR) for the target object; defining a non-occlusion area based on the minimum viable region; transmitting a positioning command for positioning an arm of a robot; transmitting a grasping command for grabbing the target object within the minimum viable region; transmitting a movement command for moving the target object based on a movement direction, a movement distance, and a movement type, using the arm of the robot; obtaining supplemental image information of the one or more objects; and calculating estimated dimensions for the target object based on the supplemental image information, wherein at least one of the positioning command, the grasping command, and the movement command are configured to prevent the arm of the robot from blocking a non-occlusion area of the one or more objects.